Abstract: Two broad classes of graphical modeling problems
for codes can be identified in the literature: constructive and 
extractive problems. The former class of problems concerns the
construction of a graphical model in order to define a new
code. The latter class of problems concerns the extraction of a
graphical model for a (fixed) given code. The design of a new
low-density parity-check code for some given criteria (e.g. target 
block length and code rate) is an example of a constructive 
problem. The determination of a graphical model for a classical 
linear block code which implies a decoding algorithm with desired 
performance and complexity characteristics is an example of an 
extractive problem. This work focuses on extractive graphical 
model problems and aims to lay out some of the foundations of 
the theory of such problems for linear codes. 

The primary focus of this work is a study of the space of all 
graphical models for a (fixed) given code. The tradeoff between 
cyclic topology and complexity in this space is characterized via 
the introduction of a new bound: the tree-inducing cut-set bound. 
The proposed bound provides a more precise characterization of 
this tradeoff than that which can be obtained using existing tools 
(e.g. the Cut-Set Bound) and can be viewed as a generalization of 
the square-root bound for tail-biting trellises to graphical models 
with arbitrary cyclic topologies. Searching the space of graphical 
models for a given code is then enabled by introducing a set 
of basic graphical model transformation operations which are 
shown to span this space. Finally, heuristics for extracting novel 
graphical models for linear block codes using these transforma- 
tions are investigated. 

I. Introduction 

Graphical models of codes have been studied since the 
1960s and this study has intensified in recent years due 
to the discovery of turbo codes by Berrou et al. [1], the 
rediscovery of Gallager's low-density parity-check (LDPC) 
codes [2] by Spielman et al. [3] and MacKay et al. [4], 
and the pioneering work of Wiberg, Loeliger and Koetter 
[5], [6]. It is now well-known that together with a suitable 
message passing schedule, a graphical model implies a soft- 
in soft-out (SISO) decoding algorithm which is optimal for 
cycle-free models and suboptimal, yet often substantially less 
complex, for cyclic models (cf. [6], [7], [8], [9], [10]). It has 
been observed empirically in the literature that there exists a 
correlation between the cyclic topology of a graphical model 
and the performance of the decoding algorithms implied by 
that graphical model (cf. [5], [10], [11], [12], [13], [14], 
[15], [16]). To summarize this empirical "folk-knowledge", 
those graphical models which imply near-optimal decoding
algorithms tend to have large girth, a small number of short
cycles and a cycle structure that is not overly regular.

Two broad classes of graphical modeling problems can be 
identified in the literature: 

• Constructive problems: Given a set of design require- 
ments, design a suitable code by constructing a good 
graphical model (i.e. a model which implies a low- 
complexity, near-optimal decoding algorithm). 

• Extractive problems: Given a specific (fixed) code, extract 
a graphical model for that code which implies a decod- 
ing algorithm with desired complexity and performance 
characteristics. 

Constructive graphical modeling problems have been widely 
addressed by the coding theory community. Capacity ap- 
proaching LDPC codes have been designed for both the 
additive white Gaussian noise (AWGN) channel (cf. [17], [18]) 
and the binary erasure channel (cf. [19], [20]). Other classes 
of modern codes have been successfully designed for a wide 
range of practically motivated block lengths and rates (cf. [21], 
[22], [23], [24], [25]). 

Less is understood about extractive graphical modeling 
problems, however. The extractive problems that have received 
the most attention are those concerning Tanner graph [11] and 
trellis representations of block codes. Tanner graphs imply 
low-complexity decoding algorithms; however, the Tanner 
graphs corresponding to many block codes of practical interest, 
e.g. high-rate Reed-Muller (RM), Reed-Solomon (RS), and 
Bose-Chaudhuri-Hocquenghem (BCH) codes, necessarily con- 
tain many short cycles [26] and thus imply poorly performing 
decoding algorithms. There is a well-developed theory of 
conventional trellises [27] and tail-biting trellises [28], [29] for 
linear block codes. Conventional and tail-biting trellises imply 
optimal and, respectively, near-optimal decoding algorithms; 
however, for many block codes of practical interest these 
decoding algorithms are prohibitively complex, thus motivating
the study of more general graphical models (i.e. models with 
a richer cyclic topology than a single cycle). 

The goal of this work is to lay out some of the foundations 
of the theory of extractive graphical modeling problems. 
Following a review of graphical models for codes in Section
II, a complexity measure for graphical models is introduced in
Section III. The proposed measure captures a cyclic graphical
model analog of the familiar notions of state and branch
complexity for trellises [27]. The minimal tree complexity of a
code, which is a natural generalization of the well-understood 
minimal trellis complexity of a code to arbitrary cycle-free 
models, is then defined using this measure. 

The tradeoff between cyclic topology and complexity in 
graphical models is studied in Section IV. Wiberg's Cut-Set
Bound (CSB) is the existing tool that best characterizes
this fundamental tradeoff [6]. While the CSB can be used 
to establish the square-root bound for tail-biting trellises [28] 
and thus provides a precise characterization of the potential 
tradeoff between cyclic topology and complexity for single- 
cycle models, as was first noted by Wiberg et al. [5], it is 
very challenging to use the CSB to characterize this tradeoff 
for graphical models with cyclic topologies richer than a single 
cycle. In order to provide a more precise characterization of 
this tradeoff than that offered by the CSB alone, this work 
introduces a new bound in Section IV, the tree-inducing cut-set
bound, which may be viewed as a generalization of the
square-root bound to graphical models with arbitrary cyclic
topologies. Specifically, it is shown that an $r$-th root complexity
reduction (with respect to the minimal tree complexity as
defined in Section III) requires the introduction of at least
$r(r-1)/2$ cycles. The proposed bound can thus be viewed
as an extension of the square-root bound to graphical models 
with arbitrary cyclic topologies. 

The transformation of graphical models is studied in Sections
V and VI. Whereas minimal conventional and tail-biting trellis
models can be characterized algebraically via trellis-oriented 
generator matrices [27], there is in general no known analog 
of such algebraic characterizations for arbitrary cycle-free 
graphical models [30], let alone cyclic models. In the absence 
of such an algebraic characterization, it is initially unclear
how cyclic graphical models can be extracted. In Section V, a
set of basic transformation operations on graphical models for 
codes is introduced and it is shown that any graphical model 
for a given code can be transformed into any other graphical 
model for that same code via the application of a finite number 
of these basic transformations. The transformations studied in 
Section V thus provide a mechanism for searching the space
of all graphical models for a given code. In Section VI,
the basic transformations introduced in Section V are used to
extract novel graphical models for linear block codes. Starting 
with an initial Tanner graph for a given code, heuristics for 
extracting other Tanner graphs, generalized Tanner graphs, 
and more complex cyclic graphical models are investigated. 
Concluding remarks and directions for future work are given 
in Section VII.

II. Background 

A. Notation 

The binomial coefficient is denoted $\binom{a}{b}$ where $a, b \in \mathbb{Z}$ are
integers. The finite field with $q$ elements is denoted $\mathbb{F}_q$. Given
a finite index set $I$, the vector space over $\mathbb{F}_q$ defined on $I$ is
the set of vectors

$$\mathbb{F}_q^I = \{ f = (f_i \in \mathbb{F}_q,\ i \in I) \}. \tag{1}$$

Suppose that $J \subseteq I$ is some subset of the index set $I$. The
projection of a vector $f \in \mathbb{F}_q^I$ onto $J$ is denoted

$$f|_J = (f_i,\ i \in J). \tag{2}$$

B. Codes, Projections, and Subcodes 

Given a finite index set $I$, a linear code over $\mathbb{F}_q$ defined
on $I$ is some vector subspace $C \subseteq \mathbb{F}_q^I$. The block length and
dimension of $C$ are denoted $n(C) = |I|$ and $k(C) = \dim C$,
respectively. If known, the minimum Hamming distance of
$C$ is denoted $d(C)$ and $C$ may be described by the triplet
$[n(C), k(C), d(C)]$. This work considers only linear codes and
the terms code and linear code are used interchangeably.

A code $C$ can be described by an $r_G \times n(C)$, $r_G \geq k(C)$,
generator matrix $G_C$ over $\mathbb{F}_q$, the rows of which span $C$. An
$r_G \times n(C)$ generator matrix is redundant if $r_G$ is strictly greater
than $k(C)$. A code $C$ can also be described by an $r_H \times n(C)$,
$r_H \geq n(C) - k(C)$, parity-check matrix $H_C$ over $\mathbb{F}_q$, the rows
of which span the null space of $C$ (i.e. the dual code $C^\perp$). Each
row of $H_C$ defines a $q$-ary single parity-check equation which
every codeword in $C$ must satisfy. An $r_H \times n(C)$ parity-check
matrix is redundant if $r_H$ is strictly greater than

$$k(C^\perp) = n(C) - k(C). \tag{3}$$

Given a subset $J \subseteq I$ of the index set $I$, the projection of
$C$ onto $J$ is the set of all codeword projections:

$$C|_J = \{ c|_J,\ c \in C \}. \tag{4}$$

Closely related to $C|_J$ is the subcode $C_J$: the projection onto
$J$ of the subset of codewords satisfying $c_i = 0$ for $i \in I \setminus J$.
Both $C|_J$ and $C_J$ are linear codes.
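The distinction between a projection and a subcode can be checked on a small example. The following Python sketch enumerates a binary code from a generator matrix and computes both objects for a chosen coordinate subset. The $[4, 2]$ code, the helper names, and the restriction to prime $q$ are assumptions made for illustration only.

```python
from itertools import product

def codewords(gen_rows, q=2):
    """Enumerate all codewords spanned by the generator rows over F_q (q prime)."""
    n = len(gen_rows[0])
    words = set()
    for coeffs in product(range(q), repeat=len(gen_rows)):
        word = tuple(
            sum(a * row[i] for a, row in zip(coeffs, gen_rows)) % q
            for i in range(n)
        )
        words.add(word)
    return words

def projection(code, J):
    """C|_J: project every codeword onto the coordinates in J."""
    return {tuple(c[i] for i in J) for c in code}

def subcode(code, J):
    """C_J: project onto J only those codewords that are zero outside J."""
    n = len(next(iter(code)))
    outside = [i for i in range(n) if i not in J]
    return {
        tuple(c[i] for i in J)
        for c in code
        if all(c[i] == 0 for i in outside)
    }

# Hypothetical binary [4, 2] code, used only for illustration.
G = [(1, 1, 0, 0), (0, 1, 1, 1)]
C = codewords(G)
J = (0, 1)

print(sorted(projection(C, J)))  # all four binary pairs
print(sorted(subcode(C, J)))     # only (0, 0) and (1, 1)
```

In this example the subcode is strictly contained in the projection, which is the typical situation.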

Suppose that $C_1$ and $C_2$ are two codes over $\mathbb{F}_q$ defined on
the same index set $I$. The intersection $C_1 \cap C_2$ of $C_1$ and $C_2$ is
a linear code defined on $I$ comprising the vectors in $\mathbb{F}_q^I$ that
are contained in both $C_1$ and $C_2$.

Finally, suppose that $C_a$ and $C_b$ are two codes defined on
the disjoint index sets $J_a$ and $J_b$, respectively. The Cartesian
product $C = C_a \times C_b$ is the code defined on the index set
$J = \{J_a, J_b\}$ such that $C|_{J_a} = C_{J_a} = C_a$ and $C|_{J_b} = C_{J_b} = C_b$.

C. Generalized Extension Codes 

Let $C$ be a linear code over $\mathbb{F}_q$ defined on the index set $I$.
Let $J \subseteq I$ be some subset of $I$ and let

$$\beta = (\beta_j \neq 0 \in \mathbb{F}_q,\ j \in J) \tag{5}$$

be a vector of non-zero elements of $\mathbb{F}_q$. A generalized extension
of $C$ is formed by adding a $q$-ary parity-check on the
codeword coordinates indexed by $J$ to $C$ (i.e. a $q$-ary partial
parity symbol). The generalized extension code $\tilde{C}$ is defined
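on the index set $\tilde{I} = I \cup \{p\}$ such that if $c = (c_i, i \in I) \in C$
then $\tilde{c} = (\tilde{c}_i, i \in \tilde{I}) \in \tilde{C}$ where $\tilde{c}_i = c_i$ if $i \in I$ and

$$\tilde{c}_p = \sum_{j \in J} \beta_j c_j. \tag{6}$$
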
The length and dimension of $\tilde{C}$ are $n(\tilde{C}) = n(C) + 1$ and
$k(\tilde{C}) = k(C)$, respectively, and the minimum distance of $\tilde{C}$
satisfies $d(\tilde{C}) \in \{d(C), d(C) + 1\}$. Note that if $J = I$ and $\beta_j = 1$
for all $j \in J$, then $\tilde{C}$ is simply a classically defined extended
code [31]. More generally, a degree-$g$ generalized extension of
$C$ is formed by adding $g$ $q$-ary partial parity symbols to $C$ and
is defined on the index set $I \cup \{p_1, p_2, \ldots, p_g\}$. The $j$-th partial
parity symbol $c_{p_j}$ in such an extension is defined as a partial
parity on some subset of $I \cup \{p_1, \ldots, p_{j-1}\}$.
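A degree-1 generalized extension can be sketched in a few lines of Python. The sketch assumes that the partial parity symbol is the $\beta$-weighted sum of the coordinates indexed by $J$, computed modulo a prime $q$ (sign conventions may differ for non-binary fields). The example code and the function names are hypothetical.

```python
def generalized_extension(code, J, beta, q=2):
    """Append one partial parity symbol sum_{j in J} beta_j * c_j (mod q)."""
    if any(b % q == 0 for b in beta):
        raise ValueError("every beta_j must be non-zero in F_q")
    return {
        c + (sum(b * c[j] for j, b in zip(J, beta)) % q,)
        for c in code
    }

def min_distance(code):
    """Minimum Hamming weight over the non-zero codewords of a linear code."""
    return min(sum(x != 0 for x in c) for c in code if any(c))

# Hypothetical binary [4, 2, 2] code, used only for illustration.
C = {(0, 0, 0, 0), (1, 1, 0, 0), (0, 1, 1, 1), (1, 0, 1, 1)}

# Partial parity on coordinates 0 and 2 with beta = (1, 1).
C_ext = generalized_extension(C, J=(0, 2), beta=(1, 1))

print(len(C_ext), min_distance(C), min_distance(C_ext))
```

The extension keeps the number of codewords (and hence the dimension) fixed, lengthens every codeword by one symbol, and in this example raises the minimum distance from 2 to 3.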

D. Graph Theory 

A graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{H})$ consists of:

• A finite non-empty set of vertices $\mathcal{V}$.

• A set of edges $\mathcal{E}$, which is some subset of the pairs
$\{\{u, v\} : u, v \in \mathcal{V}, u \neq v\}$.

• A set of half-edges $\mathcal{H}$, which is any subset of $\mathcal{V}$.

It is non-standard to define graphs with half-edges; however,
as will be demonstrated in Section II-E, half-edges are useful
in the context of graphical models for codes. A walk of length
$n$ in $\mathcal{G}$ is a sequence of vertices $v_1, v_2, \ldots, v_n, v_{n+1}$ in $\mathcal{V}$ such
that $\{v_i, v_{i+1}\} \in \mathcal{E}$ for all $i \in \{1, \ldots, n\}$. A path is a walk on
distinct vertices while a cycle of length $n$ is a walk such that
$v_1$ through $v_n$ are distinct and $v_1 = v_{n+1}$. Cycles of length
$n$ are often denoted $n$-cycles. A tree is a graph containing
no cycles (i.e. a cycle-free graph). Two vertices $u, v \in \mathcal{V}$ are
adjacent if a single edge $\{u, v\} \in \mathcal{E}$ connects $u$ to $v$. A graph
is connected if any two of its vertices are linked by a walk.
A cut in a connected graph $\mathcal{G}$ is some subset of edges $\mathcal{X} \subseteq \mathcal{E}$
the removal of which yields a disconnected graph. Cuts thus
partition the vertex set $\mathcal{V}$. Finally, a graph is bipartite if its
vertex set can be partitioned $\mathcal{V} = \mathcal{U} \cup \mathcal{W}$, $\mathcal{U} \cap \mathcal{W} = \emptyset$, such
that any edge in $\mathcal{E}$ joins a vertex in $\mathcal{U}$ to one in $\mathcal{W}$.

E. Graphical Models of Codes 

Graphical models for codes have been described by a 
number of different authors using a wide variety of notation 
(e.g. [6], [7], [8], [9], [10], [11]). The present work uses the 
notation described below which was established by Forney in 
his Codes on Graphs papers [10], [30]. 

A linear behavioral realization of a linear code $C \subseteq \mathbb{F}_q^I$
comprises three sets:

• A set of visible (or symbol) variables $\{V_i, i \in I\}$ corresponding
to the codeword coordinates¹ with alphabets
$\{\mathbb{F}_q, i \in I\}$.

• A set of hidden (or state) variables $\{S_i, i \in I_S\}$ with
alphabets $\{\mathbb{F}_q^{T_i}, i \in I_S\}$.

• A set of linear local constraint codes $\{C_i, i \in I_C\}$.

Each visible variable is $q$-ary while the hidden variable $S_i$
with alphabet $\mathbb{F}_q^{T_i}$ is $q^{|T_i|}$-ary. The hidden variable alphabet
index sets $\{T_i, i \in I_S\}$ are disjoint and unrelated to $I$. Each
local constraint code $C_i$ involves a certain subset of the visible,
$I_V(i) \subseteq I$, and hidden, $I_S(i) \subseteq I_S$, variables and defines a
subspace of the local configuration space:

$$C_i \subseteq \left( \prod_{j \in I_V(i)} \mathbb{F}_q \right) \times \left( \prod_{j \in I_S(i)} \mathbb{F}_q^{T_j} \right). \tag{7}$$

¹ Observe that this definition is slightly different than that proposed in
[30] which permitted the use of $q^r$-ary visible variables corresponding to $r$
codeword coordinates. By appropriately introducing equality constraints and
$q$-ary hidden variables, it can be seen that these two definitions are essentially
equivalent.



Each local constraint code $C_i$ thus has a well-defined block
length

$$n(C_i) = |I_V(i)| + \sum_{j \in I_S(i)} |T_j| \tag{8}$$

and dimension $k(C_i) = \dim C_i$ over $\mathbb{F}_q$. Local constraints that
involve only hidden variables are internal constraints while
those involving visible variables are interface constraints. The
full behavior of the realization is the set $\mathfrak{B}$ of all visible and
hidden variable configurations which simultaneously satisfy all
local constraint codes:

$$\mathfrak{B} \subseteq \left( \prod_{i \in I} \mathbb{F}_q \right) \times \left( \prod_{j \in I_S} \mathbb{F}_q^{T_j} \right) = \mathbb{F}_q^I \times \left( \prod_{j \in I_S} \mathbb{F}_q^{T_j} \right). \tag{9}$$

The projection of the linear code $\mathfrak{B}$ onto $I$ is precisely $C$.

Forney demonstrated in [10] that it is sufficient to consider 
only those realizations in which all visible variables are 
involved in a single local constraint and all hidden variables 
are involved in two local constraints. Such normal realiza- 
tions have a natural graphical representation in which local 
constraints are represented by vertices, visible variables by 
half-edges and hidden variables by edges. The half-edge corresponding
to the visible variable $V_i$ is incident on the vertex
corresponding to the single local constraint which involves $V_i$.
The edge corresponding to the hidden variable $S_j$ is incident
on the vertices corresponding to the two local constraints
which involve $S_j$. The notation $\mathcal{G}_C$ and the term graphical model are
used throughout this work to denote both a normal realization
of a code $C$ and its associated graphical representation.

It is assumed throughout that the graphical models con- 
sidered are connected. Equivalently, it is assumed throughout 
that the codes studied cannot be decomposed into Cartesian 
products of shorter codes [10]. Note that this restriction will 
apply only to the global code considered and not to the local 
constraints in a given graphical model. 

F. Tanner Graphs and Generalized Tanner Graphs 

The term Tanner graph has been used to describe different 
classes of graphical models by different authors. Tanner graphs 
denote those graphical models corresponding to parity-check 
matrices in this work. Specifically, let $H_C$ be an $r_H \times n(C)$
parity-check matrix for the code $C$ over $\mathbb{F}_q$ defined on the index
set $I$. The Tanner graph corresponding to $H_C$ contains $r_H + n(C)$
local constraints of which $n(C)$ are interface repetition
constraints, one corresponding to each codeword coordinate,
and $r_H$ are internal $q$-ary single parity-check constraints, one
corresponding to each row of $H_C$. An edge (hidden variable)
connects a repetition constraint $C_i$ to a single parity-check
constraint $C_j$ if and only if the codeword coordinate corresponding
to $C_i$ is involved in the single parity-check equation defined
by the row corresponding to $C_j$. A Tanner graph for $C$ is
redundant if it corresponds to a redundant parity-check matrix.
A degree-$g$ generalized Tanner graph for $C$ is simply a Tanner
graph corresponding to some degree-$g$ generalized extension
of $C$ in which the visible variables corresponding to the partial
parity symbols have been removed. Generalized Tanner graphs
have been studied previously in the literature under the rubric
of generalized parity-check matrices [32], [33].
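The correspondence between a parity-check matrix and its Tanner graph is mechanical, as the following Python sketch suggests. The vertex labels (`v0`, `c0`, ...) and the function name are hypothetical; the matrix is one standard parity-check matrix of the binary $[7, 4]$ Hamming code, whose columns are the binary expansions of 1 through 7.

```python
def tanner_graph(H):
    """Build the Tanner graph of a parity-check matrix H.

    Returns (repetition vertices, single parity-check vertices, edges):
    one repetition vertex per column, one check vertex per row, and an
    edge (hidden variable) wherever the matrix entry is non-zero.
    """
    r_h, n = len(H), len(H[0])
    var_nodes = [f"v{j}" for j in range(n)]
    chk_nodes = [f"c{i}" for i in range(r_h)]
    edges = [
        (f"v{j}", f"c{i}")
        for i in range(r_h)
        for j in range(n)
        if H[i][j] != 0
    ]
    return var_nodes, chk_nodes, edges

# Parity-check matrix of the binary [7, 4] Hamming code.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]
var_nodes, chk_nodes, edges = tanner_graph(H)

# r_H + n(C) local constraints and one hidden variable per non-zero entry.
print(len(var_nodes) + len(chk_nodes), len(edges))
```

Each repetition vertex additionally carries one half-edge for its visible variable, which the sketch leaves implicit.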
III. A Complexity Measure for Graphical Models

A. $q^m$-ary Graphical Models

This work introduces the term $q^m$-ary graphical model to
denote a normal realization of a linear code $C$ over $\mathbb{F}_q$ that
satisfies the following constraints:

• The alphabet index size of every hidden variable
$S_i$, $i \in I_S$, satisfies $|T_i| \leq m$.

• Every local constraint $C_i$, $i \in I_C$, either satisfies

$$\min\left( k(C_i),\ n(C_i) - k(C_i) \right) \leq m \tag{10}$$

or can be decomposed as a Cartesian product of
codes, each of which satisfies this condition.

The complexity measure $m$ simultaneously captures a cyclic
graphical model analog of the familiar notions of state and
branch complexity for trellises [27]. From the above definition,
it is clear that Tanner graphs and generalized Tanner graphs
for codes over $\mathbb{F}_q$ are $q$-ary graphical models. The efficacy of
this complexity measure is discussed further in Section IV-D.
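The two conditions of the definition can be evaluated directly from the hidden variable sizes and the $(n, k)$ parameters of the local constraints. The Python sketch below does so while ignoring the Cartesian-product relaxation, so it returns only an upper bound on the smallest admissible $m$. The function names and the two example models are assumptions made for illustration.

```python
def constraint_complexity(n, k):
    """min(k, n - k) for a local constraint code of length n and dimension k."""
    return min(k, n - k)

def model_complexity(hidden_sizes, constraints):
    """Upper bound on the smallest m for which a model is q^m-ary.

    hidden_sizes: |T_i| for every hidden variable.
    constraints:  (n, k) for every local constraint code.
    The Cartesian-product relaxation of the definition is ignored, so the
    true value can be smaller than the value returned here.
    """
    return max(
        max(hidden_sizes, default=0),
        max(constraint_complexity(n, k) for n, k in constraints),
    )

# Tanner graph of a [7, 4] Hamming code with column weights 1, 1, 2, 1, 2, 2, 3:
col_weights = [1, 1, 2, 1, 2, 2, 3]
repetition = [(w + 1, 1) for w in col_weights]  # one visible + w hidden symbols
checks = [(4, 3)] * 3                           # weight-4 single parity checks
tanner_m = model_complexity([1] * 12, repetition + checks)

# Trellis section with |T_{i-1}| = |T_i| = 3 and one visible variable:
trellis_m = model_complexity([3, 3], [(7, 4)])

print(tanner_m, trellis_m)
```

The Tanner graph evaluates to $m = 1$, consistent with the statement that Tanner graphs are $q$-ary graphical models.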



B. Properties of $q^m$-ary Graphical Models

The following three properties of $q^m$-ary graphical models
will be used in the proof of Theorem 3 in Section IV.

1) Internal Local Constraint Involvement Property: Any
hidden variable in a $q^m$-ary graphical model can be
made to be incident on an internal local constraint $C_i$
which satisfies $n(C_i) - k(C_i) \leq m$ without fundamentally
altering the complexity or cyclic topology of that
graphical model.

2) Internal Local Constraint Removal Property: The
removal of an internal local constraint from a $q^m$-ary
graphical model results in a $q^m$-ary graphical model for
a new code defined on the same index set.

3) Internal Local Constraint Redefinition Property: Any
internal local constraint $C_i$ in a $q^m$-ary graphical model
satisfying $n(C_i) - k(C_i) = m' \leq m$ can be equivalently
represented by $m'$ $q$-ary single parity-check equations
over the visible variable index set.

These properties, which are defined in detail in the appendix,
are particularly useful in concert. Specifically, let $\mathcal{G}_C$ be a
$q^m$-ary graphical model for the linear code $C$ over $\mathbb{F}_q$ defined on an
index set $I$. Suppose that the internal constraint $C_r$ satisfying
$n(C_r) - k(C_r) = m' \leq m$ is removed from $\mathcal{G}_C$ resulting in the
new code $C^{\setminus r}$. Denote by $C_r^{(1)}, \ldots, C_r^{(m')}$ the set of $m'$ $q$-ary
single parity-check equations that result when $C_r$ is redefined
over $I$. A vector in $\mathbb{F}_q^I$ is a codeword in $C$ if and only if it is
contained in $C^{\setminus r}$ and satisfies each of these $m'$ single
parity-check equations so that

$$C = C^{\setminus r} \cap C_r^{(1)} \cap \cdots \cap C_r^{(m')}. \tag{11}$$

C. The Minimal Tree Complexity of a Code

The minimal trellis complexity $s(C)$ of a linear code $C$ over
$\mathbb{F}_q$ is defined as the base-$q$ logarithm of the maximum hidden
variable alphabet size in its minimal trellis [34]. Considerable
attention has been paid to this quantity (cf. [34], [35], [36],
[37], [38], [39]) as it is closely related to the important,
and difficult, study of determining the minimum possible
complexity of optimal SISO decoding of a given code. This
work introduces the minimal tree complexity of a linear code
as a generalization of minimal trellis complexity to arbitrary
cycle-free graphical model topologies.

Definition 1: The minimal tree complexity of a linear code
$C$ over $\mathbb{F}_q$ is the smallest integer $t(C)$ such that there exists a
cycle-free $q^{t(C)}$-ary graphical model for $C$.

Much as $s(C) = s(C^\perp)$, the minimal tree complexity of a
code $C$ is equal to that of its dual.

Proposition 1: Let $C$ be a linear code over $\mathbb{F}_q$ with dual
$C^\perp$. Then

$$t(C) = t(C^\perp). \tag{12}$$

Proof: The dualizing procedure described by Forney [10]
can be applied to a $q^{t(C)}$-ary graphical model for $C$ in order
to obtain a graphical model for $C^\perp$ which is readily shown to
be $q^{t(C)}$-ary. □

Since a trellis is a cycle-free graphical model, $t(C) \leq s(C)$,
and all known upper bounds on $s(C)$ extend to $t(C)$.
Specifically, consider the section of a minimal trellis for
$C$ illustrated in Figure 1. The hidden (state) variables have
alphabet sizes $|T_i| \leq s(C)$ and $|T_{i-1}| \leq s(C)$, respectively.
The local constraint $C_i$ has length

$$n(C_i) = |T_i| + |T_{i-1}| + 1 \tag{13}$$

and dimension

$$k(C_i) = 1 + \min\left( |T_i|, |T_{i-1}| \right) \tag{14}$$

so that

$$n(C_i) - k(C_i) = \max\left( |T_i|, |T_{i-1}| \right) \leq s(C). \tag{15}$$

Fig. 1. The $q^{s(C)}$-ary graphical model representation of a trellis section.

Lower bounds on $s(C)$ do not, however, necessarily extend to
$t(C)$. For example, the minimal trellis complexity of a length
$2^m$, dimension $m + 1$, binary first-order Reed-Muller code
is known to be $m$ for $m \geq 3$ [36], whereas the conditionally
cycle-free generalized Tanner graphs for these codes described
in [40] have a natural interpretation as $2^{m-1}$-ary cycle-free
graphical models. The Cut-Set Bound, however, precludes $t(C)$
from being significantly smaller than $s(C)$ [10], [30].

The following lemma concerning minimal tree complexity
will be used in the proof of Theorem 3 in Section IV.

Lemma 1: Let $C$ and $C_{\mathrm{SPC}}$ be linear codes over $\mathbb{F}_q$ defined
on the index sets $I$ and $J \subseteq I$, respectively, such that $C_{\mathrm{SPC}}$ is
a $q$-ary single parity-check code. Define by $\tilde{C}$ the intersection
of $C$ and $C_{\mathrm{SPC}}$:

$$\tilde{C} = C \cap C_{\mathrm{SPC}}. \tag{16}$$

The minimal tree complexity of $\tilde{C}$ is upper-bounded by

$$t(\tilde{C}) \leq t(C) + 1. \tag{17}$$

Proof: By explicit construction of a $q^{t(C)+1}$-ary graphical
model for $\tilde{C}$. Let $\mathcal{G}_C$ be some $q^{t(C)}$-ary cycle-free graphical
model for $C$ and let $T$ be a minimal connected subtree
of $\mathcal{G}_C$ containing the set of $|J|$ interface constraints which
involve the visible variables in $J$. Denote by $I_S(T) \subseteq I_S$
and $I_C(T) \subseteq I_C$ the subset of hidden variables and local
constraints, respectively, contained in $T$. Choose some local
constraint vertex $C_\Lambda$, $\Lambda \in I_C(T)$, as a root for $T$. Observe that
the choice of $C_\Lambda$, while arbitrary, induces a directionality in
$T$: downstream toward the root vertex or upstream away from
the root vertex. For every $S_i$, $i \in I_S(T)$, denote by $J_{i,\uparrow} \subseteq J$
the subset of visible variables in $J$ which are upstream from
that hidden variable edge.

A $q^{t(C)+1}$-ary graphical model for $\tilde{C}$ is then constructed
from $\mathcal{G}_C$ by updating each hidden variable $S_i$, $i \in I_S(T)$,
to also contain the $q$-ary partial parity of the upstream visible
variables in $J_{i,\uparrow} \subseteq J$. The local constraints $C_j$, $j \in I_C(T) \setminus \Lambda$,
are updated accordingly. Finally, $C_\Lambda$ is updated to enforce the
$q$-ary single parity constraint defined by $C_{\mathrm{SPC}}$. This updating
procedure increases the alphabet index size of each hidden variable
$S_i$, $i \in I_S(T)$, by at most one and adds at most one single
parity-check (or repetition) constraint to the definition of each
$C_j$, $j \in I_C(T)$, and the resulting cycle-free graphical model
is thus at most $q^{t(C)+1}$-ary. □

The proof of Lemma 1 is detailed further by example in the
appendix.

IV. The Tradeoff Between Cyclic Topology and 
Complexity 

A. The Cut-Set and Square-Root Bounds 

Wiberg's Cut-Set Bound (CSB) [5], [6] is stated below
without proof in the language of Section II.

Theorem 1 (Cut-Set Bound): Let $C$ be a linear code over $\mathbb{F}_q$
defined on the index set $I$. Let $\mathcal{G}_C$ be a graphical model for
$C$ containing a cut $\mathcal{X}$ corresponding to the hidden variables
$S_i$, $i \in I_S(\mathcal{X})$, which partitions the index set into $J_1 \subset I$
and $J_2 \subset I$. Let the base-$q$ logarithm of the midpoint hidden
variable alphabet size of the minimal two-section trellis for $C$
on the two-section time axis $\{J_1, J_2\}$ be $s_{\mathcal{X},\min}$. The sum of
the hidden variable alphabet sizes corresponding to the cut $\mathcal{X}$
is lower-bounded by

$$\sum_{i \in I_S(\mathcal{X})} |T_i| \geq s_{\mathcal{X},\min}. \tag{18}$$

The CSB provides insight into the tradeoff between cyclic
topology and complexity in graphical models for codes and it
is natural to explore its power to quantify this tradeoff. Two
questions which arise for a given linear code $C$ over $\mathbb{F}_q$ in
such an exploration are:

1) For a given complexity $m$, how many cycles must be
contained in a $q^m$-ary graphical model for $C$?

2) For a given number of cycles $N$, what is the smallest
$m$ such that a $q^m$-ary model containing $N$ cycles for $C$
can exist?

For a fixed cyclic topology, the CSB can be simultaneously
applied to all cuts yielding a linear programming lower bound
on the hidden variable alphabet sizes [5]. For the special case
of a single-cycle graphical model (i.e. a tail-biting trellis), this
technique yields a simple solution [28]:

Theorem 2 (Square-Root Bound): Let $C$ be a linear code
over $\mathbb{F}_q$ of even length and let $s_{\mathrm{mid,min}}(C)$ be the base-$q$
logarithm of the minimum possible hidden variable alphabet
size of a conventional trellis for $C$ at its midpoint over all
coordinate orderings. The base-$q$ logarithm of the minimum
possible hidden variable alphabet size $s_{\mathrm{TB}}(C)$ of a tail-biting
trellis for $C$ is lower-bounded by

$$s_{\mathrm{TB}}(C) \geq s_{\mathrm{mid,min}}(C)/2. \tag{19}$$

The square-root bound can thus be used to answer the
questions posed above for a specific class of single-cycle
graphical models. For topologies richer than a single cycle,
however, the aforementioned linear programming technique
quickly becomes intractable. Specifically, there are

$$2^{n(C)-1} - 1 \tag{20}$$

ways to partition a size $n(C)$ visible variable index set into
two non-empty, disjoint subsets. The number of cuts to be
considered by the linear programming technique for a given
cyclic topology thus grows exponentially with block length
and a different minimal two-stage trellis must be constructed
in order to bound the size of each of those cuts.
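The count of two-way partitions is easily confirmed by enumeration for small index sets. The following Python sketch (function names hypothetical) lists each unordered split once by pinning the first index to one side, and evaluates the closed form for a length-24 code to indicate the scale of the problem.

```python
from itertools import combinations

def bipartitions(index_set):
    """Yield each unordered split of index_set into two non-empty parts once."""
    items = sorted(index_set)
    first, rest = items[0], items[1:]
    # Pinning `first` to J1 avoids producing both (J1, J2) and (J2, J1).
    for size in range(len(rest)):  # J1 never absorbs all of `rest`
        for extra in combinations(rest, size):
            j1 = {first, *extra}
            yield j1, set(items) - j1

def num_bipartitions(n):
    """Closed form for the number of two-way partitions of n coordinates."""
    return 2 ** (n - 1) - 1

small = sum(1 for _ in bipartitions(range(4)))
print(small, num_bipartitions(4))  # both equal 7
print(num_bipartitions(24))        # 8388607 splits for a length-24 code
```

Even at block length 24 there are more than eight million splits, each of which would require its own minimal two-section trellis.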

B. Tree-Inducing Cuts 

Recall that a cut in a graph $\mathcal{G}$ is some subset of the edges
$\mathcal{X} \subseteq \mathcal{E}$ the removal of which yields a disconnected graph. A
cut is thus defined without regard to the cyclic topology of
the disconnected components which remain after its removal.
In order to provide a characterization of the tradeoff between
cyclic topology and complexity which is more precise than that
provided by the CSB alone, this work focuses on a specific
type of cut which is defined below. Two useful properties of
such cuts are established by Propositions 2 and 3.

Definition 2: Let $\mathcal{G}$ be a connected graph. A tree-inducing
cut is some subset of edges $\mathcal{X}_T \subseteq \mathcal{E}$ the removal of which
yields a tree with precisely two components.

Proposition 2: Let $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{H})$ be a connected graph.
The size $X_T$ of any tree-inducing cut $\mathcal{X}_T$ in $\mathcal{G}$ is precisely

$$X_T = |\mathcal{E}| - |\mathcal{V}| + 2. \tag{21}$$

Proof: It is well-known that a connected graph is a tree
if and only if (cf. [41])

$$|\mathcal{E}| = |\mathcal{V}| - 1. \tag{22}$$

Similarly, a graph composed of two cycle-free components
satisfies

$$|\mathcal{E}| = |\mathcal{V}| - 2. \tag{23}$$

The result then follows from the observation that the size of
a tree-inducing cut is the number of edges which must be
removed in order to satisfy (23). □

Proposition 3: Let $\mathcal{G}$ be a connected graph with tree-inducing
cut size $X_T$. The number of cycles $N_{\mathcal{G}}$ in $\mathcal{G}$ is
lower-bounded by

$$N_{\mathcal{G}} \geq \binom{X_T}{2}. \tag{24}$$

Proof: Let the removal of a tree-inducing cut $\mathcal{X}_T$ in the
connected graph $\mathcal{G}$ yield the cycle-free components $\mathcal{G}_1$ and $\mathcal{G}_2$
and let $e_i \neq e_j \in \mathcal{X}_T$. Since $\mathcal{G}_1$ ($\mathcal{G}_2$) is a tree, there is a unique
path in $\mathcal{G}_1$ ($\mathcal{G}_2$) connecting $e_i$ and $e_j$. There is thus a unique
cycle in $\mathcal{G}$ corresponding to the edge pair $\{e_i, e_j\}$. There are
$\binom{X_T}{2}$ such distinct edge pairs which yields the lower bound.
Note that this is a lower bound because for certain graphs,
there can exist cycles which contain more than two edges from
a tree-inducing cut. □
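One tree-inducing cut can be found constructively: discard every edge that closes a cycle while growing a spanning tree, then remove one spanning-tree edge. The Python sketch below follows this approach using a union-find structure, which is a choice made here and not a method taken from the text. It confirms the cut size of Proposition 2 on the complete graph $K_4$ and evaluates the binomial expression of Proposition 3 for that size.

```python
from math import comb

def tree_inducing_cut(num_vertices, edges):
    """Return one tree-inducing cut of a connected simple graph.

    Vertices are 0..num_vertices-1 and edges is a list of (u, v) pairs.
    """
    if num_vertices < 2:
        raise ValueError("need at least two vertices")
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree_edges, cut = [], []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            cut.append((u, v))  # edge would close a cycle: remove it
        else:
            parent[ru] = rv
            tree_edges.append((u, v))
    if len(tree_edges) != num_vertices - 1:
        raise ValueError("graph is not connected")
    # Removing any single spanning-tree edge leaves two tree components.
    cut.append(tree_edges[-1])
    return cut

# Complete graph K4: 4 vertices, 6 edges.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
cut = tree_inducing_cut(4, k4)
x_t = len(cut)

print(x_t, len(k4) - 4 + 2)  # both equal 4
print(comb(x_t, 2))          # binomial expression of Proposition 3: 6
```

For $K_4$ the cut has four edges and the binomial expression evaluates to 6; $K_4$ itself contains seven cycles (four triangles and three 4-cycles).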

C. The Tree-Inducing Cut-Set Bound 

With tree-inducing cuts defined, the required properties of
$q^m$-ary graphical models described and Lemma 1 established,
the main result concerning the tradeoff between cyclic topology
and graphical model complexity can now be stated and
proved.

Theorem 3: Let $C$ be a linear code over $\mathbb{F}_q$ defined on
the index set $I$ and suppose that $\mathcal{G}_C$ is a $q^m$-ary graphical
model for $C$ with tree-inducing cut size $X_T$. The minimal tree
complexity of $C$ is upper-bounded by

$$t(C) \leq m X_T. \tag{25}$$

Proof: By induction on $X_T$. Let $X_T = 1$ and suppose
that $e \in \mathcal{X}_T$ is the sole edge in some tree-inducing cut $\mathcal{X}_T$
in $\mathcal{G}_C$. Since the removal of $e$ partitions $\mathcal{G}_C$ into disconnected
cycle-free components, $\mathcal{G}_C$ must be cycle-free and $t(C) \leq m$
by construction.

Now suppose that $X_T = x > 1$ and let $e \in \mathcal{X}_T$ be an edge in
some tree-inducing cut $\mathcal{X}_T$ in $\mathcal{G}_C$. By the first $q^m$-ary graphical
model property of Section III-B, $e$ is incident on some internal
local constraint $C_i$ satisfying $n(C_i) - k(C_i) = m' \leq m$. Denote
by $\mathcal{G}_{C^{\setminus i}}$ the $q^m$-ary graphical model that results when $C_i$ is
removed from $\mathcal{G}_C$, and by $C^{\setminus i}$ the corresponding code over
$I$. The tree-inducing cut size of $\mathcal{G}_{C^{\setminus i}}$ is at most $x - 1$ since
the removal of $C_i$ from $\mathcal{G}_C$ results in the removal of a single
vertex and at least two edges. By the induction hypothesis,
the minimal tree complexity of $C^{\setminus i}$ is upper-bounded by

$$t(C^{\setminus i}) \leq m(x - 1). \tag{26}$$

From the discussion of Section III-B, it is clear that $C_i$ can
be redefined as $m' \leq m$ single parity-check equations, $C_i^{(j)}$
for $j \in [1, m']$, over $I$ such that

$$C = C^{\setminus i} \cap C_i^{(1)} \cap \cdots \cap C_i^{(m')}. \tag{27}$$

It follows from Lemma 1 that

$$t(C) \leq t(C^{\setminus i}) + m' \leq mx \tag{28}$$

completing the proof. □

An immediate corollary to Theorem 3 results when Proposition 3
is applied in conjunction with the main result:

Corollary 1: Let $C$ be a linear code over $\mathbb{F}_q$ with minimal
tree complexity $t(C)$. The number of cycles $N_m$ in any $q^m$-ary
graphical model for $C$ is lower-bounded by

$$N_m \geq \binom{\lfloor t(C)/m \rfloor}{2}. \tag{29}$$
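The tradeoff described by Corollary 1 can be tabulated numerically. The Python sketch below assumes that the bound reads $N_m \geq \binom{\lfloor t(C)/m \rfloor}{2}$, which follows from combining $t(C) \leq m X_T$ with Proposition 3. The value $t(C) = 12$ is a hypothetical figure chosen for illustration and does not refer to any particular code.

```python
from math import comb

def cycle_lower_bound(t, m):
    """Lower bound on the number of cycles in a q^m-ary model.

    Assumes the bound N_m >= C(floor(t / m), 2), where t is the
    minimal tree complexity of the code.
    """
    if t < 0 or m < 1:
        raise ValueError("require t >= 0 and m >= 1")
    return comb(t // m, 2)

# Hypothetical minimal tree complexity, for illustration only.
t = 12
table = {m: cycle_lower_bound(t, m) for m in (12, 6, 4, 3, 2, 1)}
print(table)
```

As the permitted complexity $m$ decreases, the number of cycles that must be present grows quadratically in $t(C)/m$.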



D. Interpretation of the TI-CSB 

Provided $t(C)$ is known or can be lower-bounded, the
tree-inducing cut-set bound (TI-CSB) (and more specifically
Corollary 1) can be used to answer the questions posed in
Section IV-A. The TI-CSB is further discussed below.

1) The TI-CSB and the CSB: On the surface, the TI-CSB
and the CSB are similar in statement; however, there are three 
important differences between the two. First, the CSB does 
not explicitly address the complexity of the local constraints 
on either side of a given cut. Forney provided a number of 
illustrative examples in [30] that stress the importance of 
characterizing graphical model complexity in terms of both 
hidden variable size and local constraint complexity. Second, 
the CSB does not explicitly address the cyclic topology of the 
graphical model that results when the edges in a cut are re- 
moved. The removal of a tree-inducing cut results in two cycle- 
free disconnected components and the size of a tree-inducing 
cut can thus be used to make statements about the complexity 
of optimal SISO decoding using variable conditioning in a 
cyclic graphical model (cf. [10], [40], [42], [43], [44], [45]).
Finally, and most fundamentally, the TI-CSB addresses the 
aforementioned intractability of applying the CSB to graphical 
models with rich cyclic topologies. 

2) The TI-CSB and the Square-Root Bound: Theorem 3 can
be used to make a statement similar to Theorem 2 which is
valid for all graphical models containing a single cycle.

Corollary 2: Let $C$ be a linear code over $\mathbb{F}_q$ with minimal
tree complexity $t(C)$ and let $m_1$ be the smallest integer such
that there exists a $q^{m_1}$-ary graphical model for $C$ which
contains at most one cycle. Then

$$m_1 \ge t(C)/2. \tag{30}$$

More generally, Theorem 3 can be used to establish the
following generalization of the square-root bound to graphical
models with arbitrary cyclic topologies.

Corollary 3: Let $C$ be a linear code over $\mathbb{F}_q$ with minimal
tree complexity $t(C)$ and let $m_{\binom{r}{2}}$ be the smallest integer such
that there exists a $q^{m_{\binom{r}{2}}}$-ary graphical model for $C$ which
contains at most $\binom{r}{2}$ cycles. Then

$$m_{\binom{r}{2}} \ge t(C)/r. \tag{31}$$

A linear interpretation of the logarithmic complexity statement
of Corollary 3 yields the desired generalization of the
square-root bound: an $r$-th-root complexity reduction with respect
to the minimal tree complexity requires the introduction
of at least $r(r-1)/2$ cycles.
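
The trade-off stated by Corollary 3 can be made concrete with a few lines of arithmetic. The sketch below is illustrative only; the helper names `cycle_budget` and `min_complexity` are assumptions introduced here, and the second function returns the smallest integer $m$ permitted by (31), which is a necessary condition rather than a guarantee that such a model exists.

```python
from math import comb

def cycle_budget(r: int) -> int:
    """Number of cycles binom(r, 2) = r(r-1)/2 allowed in Corollary 3."""
    return comb(r, 2)

def min_complexity(t: int, r: int) -> int:
    """Smallest integer m with m >= t / r, cf. (31)."""
    return -(-t // r)  # integer ceiling of t / r

# An r-th-root reduction in complexity needs at least r(r-1)/2 cycles.
for r in (2, 3, 4, 5):
    print(r, cycle_budget(r), min_complexity(8, r))
```

For $r = 2$ the budget is a single cycle and the bound reduces to Corollary 2.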

There are few known examples of classical linear block 
codes which meet the square-root bound with equality. Shany 
and Be'ery proved that many RM codes cannot meet this 
bound under any bit ordering [46]. There does, however, exist
a tail-biting trellis for the extended binary Golay code $C_G$
which meets the square-root bound with equality so that [28]

$$s_{\mathrm{mid,min}}(C_G) = 8 \quad \text{and} \quad s_{TB}(C_G) = 4. \tag{32}$$

Given that this tail-biting trellis is a $2^4$-ary single-cycle
graphical model for $C_G$, the minimal tree complexity of
the extended binary Golay code can be upper-bounded by
Corollary 2 as

$$t(C_G) \le 8. \tag{33}$$

Note that the minimal bit-level conventional trellis for $C_G$
contains (non-central) state variables with alphabet size 512
and is thus a $2^9$-ary graphical model [35]. The proof of
Lemma 1 provides a recipe for the construction of a $2^8$-ary
cycle-free graphical model for $C_G$ from its tail-biting trellis. It
remains open as to whether the minimal tree complexity of $C_G$
is precisely 8, however.

3) Asymptotics of the TI-CSB: Denote by $N_m$ the minimum
number of cycles in any $q^m$-ary graphical model for a linear
code $C$ over $\mathbb{F}_q$ with minimal tree complexity $t(C)$. For large
values of $t(C)/m$, the lower bound on $N_m$ established by
Corollary 1 becomes

$$N_m \ge \binom{\lfloor t(C)/m \rfloor}{2} \approx \frac{t(C)^2}{2m^2}. \tag{34}$$



The ratio of the minimal complexity of a cycle-free model for
$C$ to that of a $q^m$-ary graphical model is thus upper-bounded
by

$$\frac{t(C)}{m} \lesssim \sqrt{2 N_m}. \tag{35}$$



In order to further explore the asymptotics of the tree-inducing
cut-set bound, consider a code of particular practical
interest: the binary image $C_{RS|\mathbb{F}_2}$ of the [255, 223, 33]
Reed-Solomon code $C_{RS}$. Since $C_{RS}$ is maximum distance separable,
a reasonable estimate for the minimal tree complexity of
this code is obtained from Wolf's bound [47]:

$$t(C_{RS|\mathbb{F}_2}) \approx 8\,(n(C_{RS}) - k(C_{RS})) = 256. \tag{36}$$

Figure 2 plots $N_m$ as a function of $m$ for $C_{RS|\mathbb{F}_2}$, assuming (36).
Note that since the complexity of the decoding algorithms
implied by $2^m$-ary graphical models grows roughly as $2^m$,
$\log m$ is roughly a log log decoding complexity measure.
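
The quantity discussed above can be tabulated directly from (29) and (34). The sketch below assumes the Wolf-bound estimate $t(C) \approx 256$ of (36); the names `T_RS`, `exact_bound` and `asymptotic_bound` are illustrative choices made here, and the grid of $m$ values is arbitrary.

```python
from math import comb

# Wolf-bound estimate (36) for the binary image of the [255, 223, 33] RS code.
T_RS = 8 * (255 - 223)

def exact_bound(t: int, m: int) -> int:
    """Lower bound (29) on the number of cycles in a 2^m-ary model."""
    return comb(t // m, 2)

def asymptotic_bound(t: int, m: int) -> float:
    """Large-t/m approximation t^2 / (2 m^2) from (34)."""
    return t * t / (2 * m * m)

for m in (1, 2, 4, 8, 16, 32, 64, 128, 256):
    print(m, exact_bound(T_RS, m), asymptotic_bound(T_RS, m))
```

The two columns agree closely for small $m$ and diverge as $m$ approaches $t(C)$, where the exact bound drops to zero.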




Fig. 2. Minimum number of cycles required for $2^m$-ary graphical models
of the binary image of the [255, 223, 33] Reed-Solomon code (horizontal axis: $\log m$).



4) On Complexity Measures: Much as there are many valid 
complexity measures for conventional trellises, there are many 
reasonable metrics for the measurement of cyclic graphical 
model complexity. While there exists a unique minimal trellis
for any linear block code which simultaneously minimizes all
reasonable measures of complexity [48], minimal models are not
unique even for the class of cyclic graphical models with the most
basic cyclic topology, namely tail-biting trellises [29]. The
complexity measure introduced by this work was motivated 
by the desire to have a metric which simultaneously captures 
hidden variable complexity and local constraint complexity 
thus disallowing local constraints from "hiding" complexity. 
There are many conceivable measures of local constraint 
complexity: one could upper-bound the state complexity of 
the local constraints or even their minimal tree complexity 
(thus defining minimal tree complexity recursively). The local 
constraint complexity measure used in this work is essentially 
Wolf's bound [47] and is thus a potentially conservative upper 
bound on any reasonable measure of local constraint decoding 
complexity. 

V. Graphical Model Transformation 

Let $\mathcal{G}_C$ be a graphical model for the linear code $C$ over $\mathbb{F}_q$.
This work introduces eight basic graphical model operations,
the application of which to $\mathcal{G}_C$ results in a new graphical model
for $C$:

• The merging of two local constraints $C_{i_1}$ and $C_{i_2}$ into the
new local constraint $C_i$ which satisfies

$$C_i = C_{i_1} \cap C_{i_2}. \tag{37}$$

• The splitting of a local constraint $C_j$ into two new local
constraints $C_{j_1}$ and $C_{j_2}$ which satisfy

$$C_{j_1} \cap C_{j_2} = C_j. \tag{38}$$

• The insertion/removal of a degree-2 repetition constraint.

• The insertion/removal of a trivial length 0, dimension 0
local constraint.

• The insertion/removal of an isolated partial parity-check
constraint.

Note that some of these operations have already been introduced
implicitly, both in this work and by others. For example, the
proof of the local constraint involvement property of $q^m$-ary
graphical models presented in Section III-B utilizes degree-2
repetition constraint insertion. Local constraint merging has
been considered by a number of authors under the rubric
of clustering (e.g. [9], [10]). This work introduces the term
merging specifically so that it can be contrasted with its inverse
operation: splitting. Detailed definitions of each of the eight
basic graphical model operations are given in the appendix. In
this section, it is shown that these basic operations span the
entire space of graphical models for $C$.

Theorem 4: Let $\mathcal{G}_C$ and $\widetilde{\mathcal{G}}_C$ be two graphical models for the
linear code $C$ over $\mathbb{F}_q$. Then $\mathcal{G}_C$ can be transformed into $\widetilde{\mathcal{G}}_C$
via the application of a finite number of basic graphical model
operations.

Proof: Define the following four sub-transformations
which can be used to transform $\mathcal{G}_C$ into a Tanner graph $\mathcal{G}_C^{T}$:

1) The transformation of $\mathcal{G}_C$ into a $q$-ary model $\mathcal{G}_C^{q}$.

2) The transformation of $\mathcal{G}_C^{q}$ into a (possibly) redundant
generalized Tanner graph $\mathcal{G}_C^{r}$.

3) The transformation of $\mathcal{G}_C^{r}$ into a non-redundant generalized
Tanner graph $\mathcal{G}_C^{g}$.

4) The transformation of $\mathcal{G}_C^{g}$ into a Tanner graph $\mathcal{G}_C^{T}$.

Since each basic graphical model operation has an inverse,
$\widetilde{\mathcal{G}}_C^{T}$ can be transformed into $\widetilde{\mathcal{G}}_C$ by inverting each of the four
sub-transformations. In order to prove that $\mathcal{G}_C$ can be transformed
into $\widetilde{\mathcal{G}}_C$ via the application of a finite number of basic graphical
model operations, it suffices to show that each of the four
sub-transformations requires a finite number of operations and that
the transformation of the Tanner graph $\mathcal{G}_C^{T}$ into a Tanner graph
$\widetilde{\mathcal{G}}_C^{T}$ corresponding to $\widetilde{\mathcal{G}}_C$ requires a finite number of operations.
This proof summary is illustrated in Figure 3.

Fig. 3. The transformation of $\mathcal{G}_C$ into $\widetilde{\mathcal{G}}_C$ via five sub-transformations.

That each of the five sub-transformations from $\mathcal{G}_C$ to $\widetilde{\mathcal{G}}_C^{T}$
illustrated in Figure 3 requires only a finite number of basic
graphical model operations is proved below.

1) $\mathcal{G}_C \to \mathcal{G}_C^{q}$: The graphical model $\mathcal{G}_C$ is transformed into
the $q$-ary model $\mathcal{G}_C^{q}$ as follows. Each local constraint $C_i$ in
$\mathcal{G}_C$ is split into the $n(C_i) - k(C_i)$ $q$-ary single parity-check
constraints which define it. A degree-2 repetition constraint is
then inserted into every hidden variable with alphabet index
set size $m > 1$ and these repetition constraints are then each
split into $m$ $q$-ary repetition constraints as illustrated in Figure 4.
Each local constraint $C_j$ in the resulting graphical model
satisfies $n(C_j) - k(C_j) = 1$. Similarly, each hidden variable
$S_j$ in the resulting graphical model satisfies $|T_j| = 1$.

Fig. 4. Transformation of the $q^m$-ary hidden variable $S_j$ into $q$-ary hidden
variables.

2) $\mathcal{G}_C^{q} \to \mathcal{G}_C^{r}$: A (possibly redundant) generalized Tanner
graph is simply a bipartite $q$-ary graphical model with one
vertex class corresponding to repetition constraints and one
to single parity-check constraints in which visible variables
are incident only on repetition constraints. By appropriately
inserting degree-2 repetition constraints, the $q$-ary model $\mathcal{G}_C^{q}$
can be transformed into $\mathcal{G}_C^{r}$.

3) $\mathcal{G}_C^{r} \to \mathcal{G}_C^{g}$: Let the generalized Tanner graph $\mathcal{G}_C^{r}$ correspond
to an $r_H \times (n(C) + g)$ redundant parity-check matrix
$H^{(r)}$ for a degree-$g$ generalized extension of $C$ with rank

$$\mathrm{rank}(H^{(r)}) = n(C) - k(C) + g. \tag{39}$$

A finite number of row operations can be applied to $H^{(r)}$ resulting
in a new parity-check matrix the last $r_H - \mathrm{rank}(H^{(r)})$
rows of which are all zero. Similarly, a finite number of basic
operations can be applied to $\mathcal{G}_C^{r}$ resulting in a generalized
Tanner graph containing $r_H - \mathrm{rank}(H^{(r)})$ trivial constraints
which can then be removed to yield $\mathcal{G}_C^{g}$. Specifically, consider
the row operation on $H^{(r)}$ which replaces a row $h_i$ by

$$h_i' = h_i + \beta_j h_j \tag{40}$$

where $\beta_j \in \mathbb{F}_q$. The graphical model transformation corresponding
to this row operation first merges the $q$-ary single
parity-check constraints $C_i$ and $C_j$ (which correspond to rows
$h_i$ and $h_j$, respectively) and then splits the resulting check
into the constraints $C_i'$ and $C_j$ (which correspond to rows $h_i'$
and $h_j$, respectively). Note that this procedure is valid since

$$C_i \cap C_j = C_i' \cap C_j. \tag{41}$$
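
The equivalence between a row operation and a merge followed by a split can be checked by brute force on a toy example. The sketch below is illustrative only: it works over GF(2) (so $\beta_j = 1$), the rows `h_i` and `h_j` are arbitrary assumed values, and the helper names are not from the original work.

```python
from itertools import product

def spc_code(h, n):
    """Codewords of the single parity-check constraint defined by row h over GF(2)."""
    return {x for x in product((0, 1), repeat=n)
            if sum(a & b for a, b in zip(h, x)) % 2 == 0}

def row_add(h_a, h_b):
    """Binary sum of two rows."""
    return tuple(a ^ b for a, b in zip(h_a, h_b))

n = 5
h_i = (1, 1, 0, 1, 0)
h_j = (0, 1, 1, 0, 1)

# Merge: the merged constraint is the intersection C_i ∩ C_j.
merged = spc_code(h_i, n) & spc_code(h_j, n)

# Split: the merged constraint is re-expressed as C_i' ∩ C_j,
# where C_i' corresponds to the new row h_i' = h_i + h_j.
h_i_new = row_add(h_i, h_j)
resplit = spc_code(h_i_new, n) & spc_code(h_j, n)

# Both descriptions define the same code, as in (41).
print(merged == resplit)
```

Because the merged constraint is unchanged, the code defined by the whole graphical model is unchanged as well; only its local description differs.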

4) $\mathcal{G}_C^{g} \to \mathcal{G}_C^{T}$: Let the degree-$g$ generalized Tanner graph
$\mathcal{G}_C^{g}$ correspond to an $(n(C) - k(C) + g) \times (n(C) + g)$ parity-check
matrix $H^{(g)}$. A degree-$(g-1)$ generalized Tanner graph $\mathcal{G}_C^{g-1}$
is obtained from $\mathcal{G}_C^{g}$ as follows. Denote by $H_s^{(g)}$ the parity-check
matrix for the degree-$g$ generalized extension defined
by $H^{(g)}$ which is systematic in the position corresponding
to the $g$-th partial parity symbol. Since a finite number of
row operations can be applied to $H^{(g)}$ to yield $H_s^{(g)}$, a finite
number of local constraint merge and split operations can
be applied to $\mathcal{G}_C^{g}$ to yield the corresponding generalized Tanner
graph $\mathcal{G}_{C,s}^{g}$. Removing the now isolated partial parity-check
constraint corresponding to the $g$-th partial parity symbol in
$\mathcal{G}_{C,s}^{g}$ yields the desired degree-$(g-1)$ generalized Tanner graph
$\mathcal{G}_C^{g-1}$. By repeatedly applying this procedure, all partial parity
symbols can be removed from $\mathcal{G}_C^{g}$ resulting in $\mathcal{G}_C^{T}$.

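5) $\mathcal{G}_C^{T} \to \widetilde{\mathcal{G}}_C^{T}$: Let the Tanner graphs $\mathcal{G}_C^{T}$ and $\widetilde{\mathcal{G}}_C^{T}$ correspond
to the parity-check matrices $H_C$ and $\widetilde{H}_C$, respectively.
Since $H_C$ can be transformed into $\widetilde{H}_C$ via a finite number of
row operations, $\mathcal{G}_C^{T}$ can be similarly transformed into $\widetilde{\mathcal{G}}_C^{T}$ via
the application of a finite number of local constraint merge
and split operations. □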

VI. Graphical Model Extraction via 
Transformation 

The set of basic model operations introduced in the previous 
section enables the space of all graphical models for a given 
code C to be searched, thus allowing for model extraction to 
be expressed as an optimization problem. The challenges of 
defining extraction as optimization are twofold. First, a cost 
measure on the space of graphical models must be found which 
is simultaneously meaningful in some real sense (e.g. highly 
correlated with decoding performance) and computationally 
tractable. Second, given that discrete optimization problems 
are in general very hard, heuristics for extraction must be 
found. In this section, heuristics are investigated for the 
extraction of graphical models for binary linear block codes 
from an initial Tanner graph. The cost measures considered 
are functions of the short cycle structure of graphical models. 
The use of such cost measures is motivated first by empirical 
evidence concerning the detrimental effect of short cycles on
decoding performance (cf. [6], [10], [11], [12], [13], [14], [15],
[16]) and second by the existence of an efficient algorithm
for counting short cycles in bipartite graphs [16]. Simulation
results for the models extracted via these heuristics for a
number of extended BCH codes are presented and discussed
in Section VI-D.

A. A Greedy Heuristic for Tanner Graph Extraction 

The Tanner graphs corresponding to many linear block 
codes of practical interest necessarily contain many short 
cycles [26]. Suppose that any Tanner graph for a given code 
$C$ must have girth at least $g_{\min}(C)$; an interesting problem
is the extraction of a Tanner graph for $C$ containing the
smallest number of $g_{\min}(C)$-cycles. The extraction of such
Tanner graphs is especially useful in the context of ad-hoc 
decoding algorithms which utilize Tanner graphs such as Jiang 
and Narayanan's stochastic shifting based iterative decoding 
algorithm for cyclic codes [49] and the random redundant 
iterative decoding algorithm presented in [50]. 

Algorithm 1 performs a greedy search for a Tanner graph
for $C$ with girth $g_{\min}(C)$ and the smallest number of $g_{\min}(C)$-cycles
starting with an initial Tanner graph TG($H_C$) which
corresponds to some binary parity-check matrix $H_C$. Define
an $(i, j)$-row operation as the replacement of row $h_j$ in $H_C$ by
the binary sum of rows $h_i$ and $h_j$. As detailed in the proof of
Theorem 4, if $C_i$ and $C_j$ are the single parity-check constraints
in TG($H_C$) corresponding to $h_i$ and $h_j$, respectively, then an
$(i, j)$-row operation in $H_C$ is equivalent to merging $C_i$ and $C_j$
to form a new constraint $C_{i,j} = C_i \cap C_j$ and then splitting $C_{i,j}$
into $C_i$ and $C_j'$ (where $C_j'$ enforces the binary sum of rows $h_i$
and $h_j$). Algorithm 1 iteratively finds the rows $h_i$ and $h_j$ in
$H_C$ with corresponding $(i, j)$-row operation that results in the
largest short cycle reduction in TG($H_C$) at every step. This
greedy search continues until there are no more row operations
that improve the short cycle structure of TG($H_C$).
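
The greedy search described above can be sketched compactly for the common case in which the girth is 4. The code below is a simplified illustration, not Algorithm 1 itself: it tracks only the number of 4-cycles (it does not track the girth or use the number of $(g+2)$-cycles as a tie-breaker), it counts 4-cycles from pairwise row overlaps rather than with the counting algorithm of [16], and the function names are assumptions introduced here.

```python
from itertools import combinations, permutations
from math import comb

def count_4_cycles(H):
    """Number of 4-cycles in the Tanner graph of binary matrix H.

    Two check rows sharing `o` columns contribute binom(o, 2) 4-cycles.
    """
    return sum(comb(sum(a & b for a, b in zip(r1, r2)), 2)
               for r1, r2 in combinations(H, 2))

def greedy_reduce_4_cycles(H):
    """Repeatedly apply the (i, j)-row operation that removes the most
    4-cycles; stop when no row operation reduces the count."""
    H = [list(row) for row in H]          # work on a copy
    best = count_4_cycles(H)
    while True:
        choice = None
        for i, j in permutations(range(len(H)), 2):
            new_row = [a ^ b for a, b in zip(H[i], H[j])]
            trial = H[:j] + [new_row] + H[j + 1:]
            n4 = count_4_cycles(trial)
            if n4 < best:                 # strict improvement only
                best, choice = n4, (i, j)
        if choice is None:
            return H
        i, j = choice
        H[j] = [a ^ b for a, b in zip(H[i], H[j])]

H0 = [[1, 1, 1, 0],
      [1, 1, 0, 1]]
H1 = greedy_reduce_4_cycles(H0)
print(count_4_cycles(H0), count_4_cycles(H1))  # 1 0
```

Each accepted step strictly decreases the 4-cycle count, so the loop terminates, and since every step is an invertible row operation the row space (and hence the code) is preserved.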

B. A Greedy Heuristic for Generalized Tanner Graph Extrac- 
tion 

A number of authors have studied the extraction of generalized
Tanner graphs (GTGs) of codes for which $g_{\min}(C) = 4$
with a particular focus on models which are 4-cycle-free and 
which correspond to generalized code extensions of minimal 
degree [51], [52]. Minimal degree extensions are sought be- 
cause no information is available to the decoder about the 
partial parity symbols in a generalized Tanner graph and the 
introduction of too many such symbols has been observed 
empirically to adversely affect decoding performance [52]. 

Generalized Tanner graph extraction algorithms proceed via
the insertion of partial parity symbols, an operation which is
most readily described as a parity-check matrix manipulation (see footnote 1).
Following the notation introduced in Section II, suppose that
a partial parity on the coordinates indexed by

$$J \subseteq I \cup \{p_1, p_2, \ldots, p_g\} \tag{42}$$

Footnote 1: Note that partial parity insertion can also be viewed through the lens
of graphical model transformation. The insertion of a partial parity symbol
proceeds via the insertion of an isolated partial parity check followed by
a series of local constraint merge and split operations.



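Input: $r_H \times n(C)$ binary parity-check matrix $H_C$.
Output: $r_H \times n(C)$ binary parity-check matrix $H'_C$.

$H'_C \leftarrow H_C$; $i^* \leftarrow -1$; $j^* \leftarrow -1$; $g^* \leftarrow$ girth of TG($H'_C$);
$N^*_{g^*} \leftarrow$ number of $g^*$-cycles in TG($H'_C$);
$N^*_{g^*+2} \leftarrow$ number of $(g^*+2)$-cycles in TG($H'_C$);
repeat
    if $i^* \ne j^*$ then replace row $h_{j^*}$ in $H'_C$ with binary sum of rows $h_{i^*}$ and $h_{j^*}$;
    $i^* \leftarrow -1$; $j^* \leftarrow -1$;
    for $i, j = 0, \ldots, r_H - 1$, $i \ne j$ do
        Replace row $h_j$ in $H'_C$ with binary sum of rows $h_i$ and $h_j$;
        $g \leftarrow$ girth of TG($H'_C$);
        $N_g \leftarrow$ number of $g$-cycles in TG($H'_C$);
        $N_{g+2} \leftarrow$ number of $(g+2)$-cycles in TG($H'_C$);
        if $g > g^*$ then
            $g^* \leftarrow g$; $i^* \leftarrow i$; $j^* \leftarrow j$; $N^*_{g^*} \leftarrow N_g$; $N^*_{g^*+2} \leftarrow N_{g+2}$;
        else if $g = g^*$ AND $N_g < N^*_{g^*}$ then
            $i^* \leftarrow i$; $j^* \leftarrow j$; $N^*_{g^*} \leftarrow N_g$; $N^*_{g^*+2} \leftarrow N_{g+2}$;
        else if $g = g^*$ AND $N_g = N^*_{g^*}$ then
            if $N_{g+2} < N^*_{g^*+2}$ then
                $i^* \leftarrow i$; $j^* \leftarrow j$; $N^*_{g^*+2} \leftarrow N_{g+2}$;
            end
        end
        Undo row replacement;
    end
until $i^* = -1$ and $j^* = -1$;
return $H'_C$

Algorithm 1: Greedy heuristic for the reduction of short
cycles in Tanner graphs for binary codes.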



is to be introduced to a GTG for $C$ corresponding to a degree-$g$
generalized extension $\widetilde{C}$ with parity-check matrix $H_{\widetilde{C}}$. A
row $h_p$ is first appended to $H_{\widetilde{C}}$ with a 1 in the positions
corresponding to coordinates indexed by $J$ and a 0 in the other
positions. A column is then appended to $H_{\widetilde{C}}$ with a 1 only in
the position corresponding to $h_p$. The resulting parity-check
matrix describes a degree-$(g+1)$ generalized extension of $C$.
Every row $h_i \ne h_p$ in the resulting matrix which contains a 1 in all of the
positions corresponding to coordinates indexed by $J$ is then
replaced by the binary sum of $h_i$ and $h_p$. Suppose that there
are $r(J)$ such rows. It is readily verified that the tree-inducing
cut size $\widetilde{X}_T$ of the GTG that results from this insertion is
related to that of the initial GTG, $X_T$, by

$$\Delta X_T = X_T - \widetilde{X}_T = (|J| - 1)(r(J) - 1). \tag{43}$$
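
The insertion procedure just described is a short matrix manipulation. The following sketch implements it for binary matrices stored as lists of rows; all function names are illustrative assumptions, the 4-cycle count uses pairwise row overlaps, and the null-space helper is a brute-force check suitable only for very small examples.

```python
from itertools import combinations, product
from math import comb

def rows_covering(H, J):
    """Indices of rows with a 1 in every position of J; their number is r(J)."""
    return [i for i, row in enumerate(H) if all(row[c] for c in J)]

def insert_partial_parity(H, J):
    """Append the row h_p and a new column, then replace every row
    covering J by its binary sum with h_p."""
    n = len(H[0])
    covering = set(rows_covering(H, J))
    h_p = [1 if c in J else 0 for c in range(n)]
    new_H = []
    for i, row in enumerate(H):
        if i in covering:
            # h_i + h_p clears the positions in J and sets the new column.
            new_H.append([a ^ b for a, b in zip(row, h_p)] + [1])
        else:
            new_H.append(list(row) + [0])
    new_H.append(h_p + [1])
    return new_H

def cut_size_reduction(size_J, r_J):
    """Tree-inducing cut size reduction (|J| - 1)(r(J) - 1) of (43)."""
    return (size_J - 1) * (r_J - 1)

def count_4_cycles(H):
    return sum(comb(sum(a & b for a, b in zip(r1, r2)), 2)
               for r1, r2 in combinations(H, 2))

def null_space(H):
    """All binary vectors satisfying every check in H (brute force)."""
    n = len(H[0])
    return {x for x in product((0, 1), repeat=n)
            if all(sum(a & b for a, b in zip(row, x)) % 2 == 0 for row in H)}

H = [[1, 1, 1, 0],
     [1, 1, 0, 1]]
J = (0, 1)
H_ext = insert_partial_parity(H, J)
print(count_4_cycles(H), count_4_cycles(H_ext))  # 1 0
```

In this toy case the single 4-cycle is removed at the cost of one partial parity symbol, and projecting the extended code onto the original four coordinates recovers the original code.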



Algorithm 3 performs a greedy search for a 4-cycle-free
generalized Tanner graph for $C$ with the smallest number of
inserted partial parity symbols starting with an initial Tanner
graph TG($H_C$) which corresponds to some binary parity-check
matrix $H_C$. Algorithm 3 iteratively finds the symbol subsets
that result in the largest tree-inducing cut size reduction
and then introduces the partial parity symbol corresponding to
one of those subsets. At each step, Algorithm 3 uses Algorithm 2
to generate a candidate list of partial parity symbols to insert
and chooses from that list the symbol which removes the most
short cycles when inserted. This greedy procedure continues
until the generalized Tanner graph contains no 4-cycles.

Algorithm 3 is closely related to the GTG extraction
heuristics proposed by Sankaranarayanan and Vasic in [51]
and Kumar and Milenkovic in [52] (henceforth referred to
as the SV and KM heuristics, respectively). It is readily
shown, using the proof technique of [51], that Algorithm 3 is
guaranteed to terminate. The SV heuristic considers only the
insertion of partial parity symbols corresponding to coordinate
index sets of size 2 (i.e. $|J| = 2$). The KM heuristic considers
only the insertion of partial parity symbols corresponding
to coordinate index sets satisfying $r(J) = 2$. Algorithm 2,
however, considers all coordinate index sets satisfying $|J| =
2, 3, 4$ or $r(J) = 2, 3, 4$ and then uses (43) to evaluate which
of these coordinate sets results in the largest tree-inducing
cut size reduction. Algorithm 3 is thus able to extract GTGs
corresponding to generalized extensions of smaller degree
than the SV and KM heuristics. In order to illustrate this
observation, the degrees of the generalized code extensions
that result when the SV, KM and proposed (HC) heuristics are
applied to parity-check matrices for three codes are provided
in Table I. Figure 5 compares the performance of the three
extracted GTG decoding algorithms for the [31, 21, 5] BCH
code in order to illustrate the efficacy of extracting GTGs
corresponding to extensions of smallest possible degree.



Code                 SV     KM     HC
[23, 12, 7] Golay    18     11     10
[31, 21, 5] BCH      47     19     12
[63, 30, 13] BCH    264    121     69

TABLE I

Generalized code extension degrees corresponding to the
4-cycle-free GTGs obtained via the SV, KM, and HC heuristics.
Fig. 5. Bit error rate performance of three GTG decoding algorithms for
the [31, 21, 5] BCH code. One-hundred iterations of a flooding schedule were
performed. Binary antipodal signaling over an AWGN channel is assumed.



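Input: Binary generalized parity-check matrix $H_{\widetilde{C}}$.
Output: List $\mathcal{S}$ of best partial parity symbol sets.

$\mathcal{S} \leftarrow \emptyset$; $\Delta X_T^* \leftarrow 0$;

// Consider pairs of columns of $H_{\widetilde{C}}$.
$J_2 \leftarrow$ coordinate pair that maximizes $r(J_2)$;
Append $J_2$ to $\mathcal{S}$; $\Delta X_T^* \leftarrow r(J_2) - 1$;

// Consider 3-tuples of columns of $H_{\widetilde{C}}$.
$J_3 \leftarrow$ coordinate 3-tuple that maximizes $r(J_3)$;
if $2(r(J_3) - 1) > \Delta X_T^*$ then
    $\mathcal{S} \leftarrow \emptyset$; Append $J_3$ to $\mathcal{S}$; $\Delta X_T^* \leftarrow 2(r(J_3) - 1)$;
else if $2(r(J_3) - 1) = \Delta X_T^*$ then Append $J_3$ to $\mathcal{S}$;
end

// Consider 4-tuples of columns of $H_{\widetilde{C}}$.
$J_4 \leftarrow$ coordinate 4-tuple that maximizes $r(J_4)$;
if $3(r(J_4) - 1) > \Delta X_T^*$ then
    $\mathcal{S} \leftarrow \emptyset$; Append $J_4$ to $\mathcal{S}$; $\Delta X_T^* \leftarrow 3(r(J_4) - 1)$;
else if $3(r(J_4) - 1) = \Delta X_T^*$ then Append $J_4$ to $\mathcal{S}$;
end

// Consider pairs of rows of $H_{\widetilde{C}}$.
$J_i \leftarrow$ largest coordinate subset such that $r(J_i) = 2$;
if $|J_i| - 1 > \Delta X_T^*$ then
    $\mathcal{S} \leftarrow \emptyset$; Append $J_i$ to $\mathcal{S}$; $\Delta X_T^* \leftarrow |J_i| - 1$;
else if $|J_i| - 1 = \Delta X_T^*$ then Append $J_i$ to $\mathcal{S}$;
end

// Consider 3-tuples of rows of $H_{\widetilde{C}}$.
$J_j \leftarrow$ largest coordinate subset such that $r(J_j) = 3$;
if $2(|J_j| - 1) > \Delta X_T^*$ then
    $\mathcal{S} \leftarrow \emptyset$; Append $J_j$ to $\mathcal{S}$; $\Delta X_T^* \leftarrow 2(|J_j| - 1)$;
else if $2(|J_j| - 1) = \Delta X_T^*$ then Append $J_j$ to $\mathcal{S}$;
end

// Consider 4-tuples of rows of $H_{\widetilde{C}}$.
$J_k \leftarrow$ largest coordinate subset such that $r(J_k) = 4$;
if $3(|J_k| - 1) > \Delta X_T^*$ then
    $\mathcal{S} \leftarrow \emptyset$; Append $J_k$ to $\mathcal{S}$; $\Delta X_T^* \leftarrow 3(|J_k| - 1)$;
else if $3(|J_k| - 1) = \Delta X_T^*$ then Append $J_k$ to $\mathcal{S}$;
end
return $\mathcal{S}$

Algorithm 2: Heuristic for generating candidate partial parity
symbols.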

Input: Binary parity-check matrix $H_C$.
Output: Binary generalized parity-check matrix $H_{\widetilde{C}}$.

$H_{\widetilde{C}} \leftarrow H_C$;
while GTG($H_{\widetilde{C}}$) contains 4-cycles do
    $\mathcal{S} \leftarrow$ set of candidate partial parity symbol subsets from Algorithm 2;
    $J^* \leftarrow$ subset in $\mathcal{S}$ the insertion of which removes the most 4-cycles in GTG($H_{\widetilde{C}}$);
    Insert symbol corresponding to $J^*$ in $H_{\widetilde{C}}$;
end
return $H_{\widetilde{C}}$

Algorithm 3: Greedy heuristic for the removal of 4-cycles
in binary generalized Tanner graphs.
C. A Greedy Heuristic for $2^m$-ary Model Extraction

For most codes, the decoding algorithms implied by generalized
Tanner graphs exhibit only modest gains, if any, with respect
to those implied by Tanner graphs, thus motivating
the search for more complex graphical models. Algorithm 4
iteratively applies the constraint merging operation in order to
obtain a $2^m$-ary graphical model from an initial Tanner graph
TG($H_C$) for some prescribed maximum complexity $m^*$. At
each step, Algorithm 4 determines the pair of local constraints
$C_i$ and $C_j$ which when merged removes the most short cycles
without violating the maximum complexity constraint $m^*$. In
order to ensure that the efficient cycle counting algorithm
of [16] can be utilized, only pairs of checks which are both
internal or both interface are merged at each step. Since the
initial Tanner graph is bipartite with vertex classes corresponding
to interface (repetition) and internal (single parity-check)
constraints, the graphical model that results from every such
local constraint merge operation is similarly bipartite.

Input: Tanner graph TG($H_C$). Max. complexity $m^*$.
Output: $2^{m^*}$-ary graphical model GM for $C$.

GM $\leftarrow$ TG($H_C$);
repeat
    $(C_i, C_j) \leftarrow$ pair of interface or internal constraints
        the merging of which removes the most
        4-cycles from GM while not violating
        the $2^{m^*}$-ary complexity constraint;
    Merge local constraints $C_i$ and $C_j$ in GM;
until no allowed 4-cycle reducing merge operations remain;
return GM

Algorithm 4: Greedy heuristic for the extraction of $2^m$-ary
graphical models.

D. Simulation Results 

The proposed extraction heuristics were applied to two extended
BCH codes with parameters [32, 21, 6] and [64, 51, 6],
respectively. In both Figures 6 and 7, the performance of a
number of suboptimal SISO decoding algorithms for these
codes is compared to algebraic hard-in hard-out (HIHO)
decoding and optimal trellis SISO decoding. Binary antipodal
signaling over AWGN channels is assumed throughout.

Initial parity-check matrices $H$ were formed by extending
cyclic parity-check matrices for the respective [31, 21, 5] and
[63, 51, 5] BCH codes [31]. These initial parity-check matrices
were used as inputs to Algorithm 1, yielding the parity-check
matrices $H'$, which in turn were used as inputs to Algorithm 3,
yielding 4-cycle-free generalized Tanner graphs. The suboptimal
decoding algorithms implied by these graphical models
are labeled TG($H$), TG($H'$), and GTG($H'$), respectively.
The generalized Tanner graphs extracted for the [32, 21, 6]
and [64, 51, 6] codes correspond to degree-17 and degree-40
generalized extensions, respectively. Finally, the parity-check
matrices $H'$ were used as inputs to Algorithm 4 with various
values of $m^*$. The numbers of 4-, 6-, and 8-cycles ($N_4$, $N_6$, $N_8$)
contained in the extracted graphical models for the [32, 21, 6]
and [64, 51, 6] codes are given in Tables II and III, respectively.




Fig. 6. Bit error rate performance of different decoding algorithms for the
[32, 21, 6] extended BCH code (horizontal axis: $E_b/N_0$ in dB). Fifty iterations of a flooding schedule were
performed for all of the suboptimal SISO decoding algorithms.





               N_4      N_6        N_8
TG(H)         1128    37404    1126372
TG(H')         453    11152     260170
GTG(H')          0       62        298
4-ary GM       244     3852      50207
16-ary GM       70      340        724

TABLE II

Short cycle structure of the initial and extracted graphical
models for the [32, 21, 6] extended BCH code.



The utility of Algorithm 1 is illustrated in both Figures 6 and
7: the TG($H'$) algorithms outperform the TG($H$) algorithms
by approximately 0.1 dB and 0.5 dB at a bit error rate (BER)
of 10^'* for the [32,21,6] and [64,51,6] codes, respectively. 
For both codes, the 4-cycle-free generalized Tanner graph 
decoding algorithms outperform Tanner graph decoding by 
approximately 0.2 dB at a BER of 10^''. Further performance 
improvements are achieved for both codes by going beyond 
binary models. Specifically, at a BER of 10"'', the suboptimal 
SISO decoding algorithm implied by the extracted 16-ary 
graphical model for the [32, 21, 6] code outperforms algebraic 
HIHO decoding by approximately 1.5 dB. The minimal trellis 
for this code is known to contain state variables with alphabet 
size at least 1024 [34], yet the 16-ary suboptimal SISO decoder 
performs only 0.7 dB worse at a BER of 10~^. At a BER of 
10^'*, the suboptimal SISO decoding algorithm implied by 
the extracted 32-ary graphical model for the [64,51,6] code 
outperforms algebraic HIHO decoding by approximately 1.2 
dB. The minimal trellis for this code is known to contain state 
variables with alphabet size at least 4096 [34]; that a 32-ary 
suboptimal SISO decoder loses only 0.7 dB with respect to 
the optimal SISO decoder at a BER of 10^^ is notable. 
Fig. 7. Bit error rate performance of different decoding algorithms for the
[64, 51, 6] extended BCH code. Fifty iterations of a flooding schedule were
performed for all of the suboptimal SISO decoding algorithms.





               N_4        N_6          N_8
TG(H)         9827    1057248    111375740
TG(H')        3797     270554     19374579
GTG(H')          0        163         1229
8-ary GM       847      19590       304416
32-ary GM      201       1384

TABLE III

Short cycle structure of the initial and extracted graphical
models for the [64, 51, 6] extended BCH code.



VII. Conclusion and Future Work 

This work studied the space of graphical models for a 
given code in order to lay out some of the foundations of 
the theory of extractive graphical modeling problems. The 
primary contributions of this work were the introduction of a 
new bound characterizing the tradeoff between cyclic topology 
and complexity in graphical models for linear codes and the 
introduction of a set of basic graphical model transformation 
operations which were shown to span the space of all graphical 
models for a given code. It was demonstrated that these 
operations can be used to extract novel cyclic graphical models 
- and thus novel suboptimal iterative soft-in soft-out (SISO) 
decoding algorithms - for linear block codes. 

There are a number of interesting directions for future work 
motivated by the statement of the tree-inducing cut-set bound 
(TI-CSB). While the minimal trellis complexity s{C) of linear 
codes is well-understood, less is known about minimal tree 
complexity t{C) and characterizing those codes for which 
t(C) < s(C) is an open problem. A particularly interesting 
open problem is the use of the Cut-Set Bound to establish an 
upper bound on the difference between s(C) and t{C); such 
a bound would allow for a re-expression of the TI-CSB in 
terms of the more familiar minimal trellis complexity. A study 
of those codes which meet or approach the TI-CSB is also an
interesting direction for future work which may provide insight
into construction techniques for good codes with short block 
lengths (e.g. 10s to 100s of bits) defined on graphs with a 
few cycles (e.g. 3, 6 or 10). The development of statements 
similar to the TI-CSB for alternative measures of graphical 
model complexity and for graphical models of more general 
systems (e.g. group codes, nonlinear codes) is also interesting. 

There are also a number of interesting directions for future
work motivated by the study of graphical model transformation.
While the extracted graphical models presented in Section VI-D
are notable, ad-hoc techniques utilizing massively
redundant models and judicious message filtering outperform
the models presented in this work [49], [50]. Such massively
redundant models contain many more short cycles than the
models presented in Section VI-D, indicating that short cycle
structure alone is not a sufficiently meaningful cost measure
for graphical model extraction. It is known that redundancy 
can be used to remove pseudocodewords (cf. [53]) thus 
motivating the study of cost measures which consider both 
short cycle structure and pseudocodeword spectrum. Finally, 
it would be interesting to study extraction heuristics beyond 
simple greedy searches, as well as those which use all of the 
basic graphical model operations (rather than just constraint 
merging). 

Appendix 

This appendix provides detailed definitions of both the $q^m$-ary
graphical model properties described in Section III-B and
the basic graphical model operations introduced in Section V.
The proof of Lemma 1 is also further illustrated by example.
In order to elucidate these properties and definitions, a single-cycle
graphical model for the extended Hamming code is
studied throughout.

A. Single-Cycle Model for the Extended Hamming Code 

Fig. 8. Tail-biting trellis graphical model for the length 8 extended Hamming
code $C_H$.

Figure 8 illustrates a single-cycle graphical model (i.e. a
tail-biting trellis) for the length 8 extended Hamming code
$C_H$. The hidden variables $S_1$ and $S_5$ are binary while $S_2$,
$S_3$, $S_4$, $S_6$, $S_7$, and $S_8$ are 4-ary. All of the local constraint
codes in this model are interface constraints. Equations (44)-(47)
define the local constraint codes via generator matrices
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(where Gi generates Ci): 



G. = 



G5 



G7 



Si Vi S2 

"1 10 

1 01 

S3 V3 S4 
10 01 

01 11 
00 1 01 

"1 10 

1 01 

S7 V7 Ss 

10 01 

01 11 
00 1 01 



G2 = 



G4 = 



S2 V2 5*3 

10 1 10 
01 1 01 

'10 1 0" 
01 1 1 



trivial from a decoding complexity viewpoint. Furthermore, 
the insertion of Si and Ci does not fundamentally alter the 
cyclic topology of Qc since no new cycles can be introduced 

(44) by this procedure. 

As an example, consider the binary hidden variable Si in 
Figure |8] which is incident on the interface constraints Ci and 
Cs- By introducing the new binary hidden variable and 
binary repetition constraint Cg, as illustrated in Figure |9] S'l 

(45) can be made to be incident on the internal constraint Cg. The 







Vi 




T . 


1 S2 


Ss S7 


Cs - 


— c, 



Gfi — 



Gs — 



10 1 10 
01 1 01 

Ss Vs Si 

10 1 0" 
01 1 1 



(46) 



(47) 



The graphical model for Ch illustrated in Figure [8] is 4- 
ary (i.e. q = 2, m = 2): the maximum hidden variable 
alphabet index set size is 2 and all local constraints satisfy 
rain {k{C^),n{C^) ~ k{C,)) < 2. The behavior, ^h, of this 
graphical model is generated by 

Si Vi S2 V2 S3 V3 Si Vi Ss Sg Ve S7 Vr Ss Vg 

1 01 1 01 1 10 1 00 00 00 

00 00 1 01 1 1 10 1 10 1 00 

00 00 00 1 01 1 01 1 10 1 

1 10 1 10 1 00 00 00 1 01 1 



G 



(48) 

The projection of onto the visible variable index set /, 
05 //|/, is thus generated by 

Vi V2 V3 Vi Ve V7 Vg 

1 1 1 1 0' 

110 110 

1 1 1 1 

1 1 1 1 



G 



(49) 



which coincides precisely with a generator matrix for Ch- 

B. q™-ary Graphical Model Properties 

The three properties of q^-ary graphical models introduced 
in Section IIII-BI are discussed in detail in the following where 
it is assumed that a ^"'-ary graphical model Qc with behavior 
03 for a linear code C over ¥q defined on an index set / is 
given. 

1) Internal Local Constraint Involvement Property: Sup- 
pose there exists some hidden variable Sj (involved in the 
local constraints Cj^ and Cj^) that does not satisfy the local 
constraint involvement property. A new hidden variable Si 
that is a copy of Sj is introduced to Qc by first redefining 
over Si and then inserting a local repetition constraint 
Ci that enforces Sj = Si- The insertion of Si and C,; does 
not fundamentally alter the complexity of Qc since n{Ci) — 
k{Ci) = \Tj\ < m and since degree- 2 repetition constraints are 



Vi 





Ss 




Si 




Cs 




c., 




Ci 







Fig. 9. Insertion of hidden variable Sq and internal local constraint C<j into 
the tail-biting trellis for Ch- 



insertion of 5*9 and Cg redefines Cg over 5*9 resulting in the 
generator matrices 



Gs — 



Sg Vs S9 

10 1 0' 
01 1 1 



Sq Si 



G9 = 1 1 



(50) 



Clearly, the modified local constraints Cg and Cg satisfy the 
condition for inclusion in a 4-ary graphical model. 
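The example can be checked numerically. The following Python sketch is not part of the original development: it joins the local constraints of (44)-(47) by brute force, projects the resulting behavior onto the visible variables, and repeats the computation after inserting $S_9$ and $\mathcal{C}_9$ as in (50). The helper names (`row_space`, `trellis`, `behavior`, `visible_code`) are hypothetical, and the generator rows are transcribed on the assumption that the matrices read as given above.

```python
from itertools import product

def row_space(rows):
    """All GF(2) linear combinations of the given bit strings, as bit tuples."""
    vecs = [[int(b) for b in r] for r in rows]
    return {tuple(sum(c * v[i] for c, v in zip(cs, vecs)) % 2
                  for i in range(len(vecs[0])))
            for cs in product((0, 1), repeat=len(vecs))}

# Generator rows transcribed from (44)-(47); the four patterns repeat.
G_A = ["1010", "0101"]             # binary state, visible bit, 4-ary state
G_B = ["10110", "01101"]           # 4-ary state, visible bit, 4-ary state
G_C = ["10001", "01011", "00101"]  # 4-ary state, visible bit, 4-ary state
G_D = ["1010", "0111"]             # 4-ary state, visible bit, binary state

def trellis(insert_c9=False):
    """Local constraints as (variables with bit widths, generator rows)."""
    last = "S9" if insert_c9 else "S1"
    cons = [
        ([("S1", 1), ("V1", 1), ("S2", 2)], G_A),
        ([("S2", 2), ("V2", 1), ("S3", 2)], G_B),
        ([("S3", 2), ("V3", 1), ("S4", 2)], G_C),
        ([("S4", 2), ("V4", 1), ("S5", 1)], G_D),
        ([("S5", 1), ("V5", 1), ("S6", 2)], G_A),
        ([("S6", 2), ("V6", 1), ("S7", 2)], G_B),
        ([("S7", 2), ("V7", 1), ("S8", 2)], G_C),
        ([("S8", 2), ("V8", 1), (last, 1)], G_D),
    ]
    if insert_c9:
        cons.append(([("S9", 1), ("S1", 1)], ["11"]))  # repetition constraint C9
    return cons

def behavior(constraints):
    """Configurations (variable -> value) that satisfy every local constraint."""
    configs = [{}]
    for variables, gen in constraints:
        joined = []
        for word in row_space(gen):
            local, pos = {}, 0
            for name, width in variables:
                local[name] = word[pos:pos + width]
                pos += width
            # keep configurations that agree with this local codeword
            joined += [{**cfg, **local} for cfg in configs
                       if all(cfg.get(k, v) == v for k, v in local.items())]
        configs = joined
    return configs

def visible_code(constraints):
    """Projection of the behavior onto V1..V8."""
    return {tuple(cfg[f"V{i}"][0] for i in range(1, 9))
            for cfg in behavior(constraints)}

G_H = ["11110000", "00110110", "00001111", "01100011"]  # rows of (49)
code_h = row_space(G_H)
print(visible_code(trellis()) == code_h)
print(visible_code(trellis(insert_c9=True)) == code_h)
```

Both comparisons should agree if the transcription is right, which is consistent with the claim that inserting a degree-2 repetition constraint leaves the code unchanged.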

2) Internal Local Constraint Removal Property: The removal of the internal constraint $\mathcal{C}_r$ from $\mathcal{G}_{\mathcal{C}}$ in order to define the new code $\mathcal{C}^{\setminus r}$ proceeds as follows. Each hidden variable $S_i$, $i \in I_S(r)$, is first disconnected from $\mathcal{C}_r$ and connected to a new degree-1 internal constraint $\mathcal{C}_{i'}$ which does not impose any constraint on the value of $S_i$ (since it is degree-1). The local constraint $\mathcal{C}_r$ is then removed from the resulting graphical model yielding $\mathcal{G}_{\mathcal{C}^{\setminus r}}$ with behavior $\mathfrak{B}^{\setminus r}$. The new code $\mathcal{C}^{\setminus r}$ is the projection of $\mathfrak{B}^{\setminus r}$ onto $I$.

As an example, consider the removal of the internal local constraint $\mathcal{C}_9$ from the graphical model for $\mathcal{C}_H$ described above; the resulting graphical model update is illustrated in Figure 10. The new codes $\mathcal{C}_{10}$ and $\mathcal{C}_{11}$ are length 1, dimension 1 codes which thus impose no constraints on $S_1$ and $S_9$, respectively. It is readily verified that the code $\mathcal{C}_H^{\setminus 9}$ which results from the removal of $\mathcal{C}_9$ from the graphical model for $\mathcal{C}_H$ has dimension 5 and is generated by

$$
G_{H \setminus 9} = \begin{array}{cccccccc}
V_1 & V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 \\ \hline
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 & 0 & 1 & 1 & 0
\end{array} \tag{51}
$$

Note that $\mathcal{C}_H^{\setminus 9}$ corresponds to all paths in the tail-biting trellis representation of $\mathcal{C}_H$, not just those paths which begin and end in the same state.

Fig. 10. Removal of internal local constraint $\mathcal{C}_9$ from the tail-biting trellis for $\mathcal{C}_H$.
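The path interpretation can be illustrated with a short enumeration. This is a sketch only, not taken from the original: each local constraint of (44)-(47) is read as a trellis section with branches (state, visible bit, next state), and the hypothetical helper `path_labels` collects the visible labels either of all paths or of only those paths whose start and end states agree.

```python
from itertools import product

def row_space(rows):
    """All GF(2) linear combinations of the given bit strings, as bit tuples."""
    vecs = [[int(b) for b in r] for r in rows]
    return {tuple(sum(c * v[i] for c, v in zip(cs, vecs)) % 2
                  for i in range(len(vecs[0])))
            for cs in product((0, 1), repeat=len(vecs))}

def section(rows, state_bits):
    """Trellis branches (state, visible bit, next state) of one local constraint."""
    return [(w[:state_bits], w[state_bits], w[state_bits + 1:])
            for w in row_space(rows)]

# Sections for C1..C8, transcribed from (44)-(47); the pattern has period four.
A = section(["1010", "0101"], 1)
B = section(["10110", "01101"], 2)
C = section(["10001", "01011", "00101"], 2)
D = section(["1010", "0111"], 2)
SECTIONS = [A, B, C, D, A, B, C, D]

def path_labels(tail_biting):
    """Visible labels of trellis paths; optionally keep only start == end."""
    labels = set()
    for start in [(0,), (1,)]:
        frontier = [(start, ())]
        for sec in SECTIONS:
            frontier = [(nxt, lab + (v,))
                        for state, lab in frontier
                        for s, v, nxt in sec if s == state]
        labels |= {lab for end, lab in frontier
                   if not tail_biting or end == start}
    return labels

code_h = path_labels(tail_biting=True)        # paths closing on themselves
code_removed = path_labels(tail_biting=False)  # all paths (C9 removed)
print(len(code_h), len(code_removed))
```

Under these assumptions the unconstrained path set is twice the size of the tail-biting one, in line with the dimension increasing from 4 to 5.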



The removal of an internal local constraint $\mathcal{C}_r$ results in the introduction of $|I_S(r)|$ new degree-1 local constraints. Forney described such constraints as "useless" in [30] and they can indeed be removed from $\mathcal{G}_{\mathcal{C}^{\setminus r}}$ since they impose no constraints on the variables they involve. Specifically, for each hidden variable $S_i$, $i \in I_S(r)$, involved in the (removed) local constraint $\mathcal{C}_r$, denote by $\mathcal{C}_j$ the other constraint involving $S_i$ in $\mathcal{G}_{\mathcal{C}}$. The constraint $\mathcal{C}_j$ can be redefined as its projection onto $I_V(j) \cup (I_S(j) \setminus \{i\})$. It is readily verified that the resulting constraint $\mathcal{C}_j$ satisfies the condition for inclusion in a $q^m$-ary graphical model.

Continuing with the above example, $\mathcal{C}_{10}$, $\mathcal{C}_{11}$, $S_1$, and $S_9$ can be removed from the graphical model illustrated in Figure 10 by redefining $\mathcal{C}_1$ and $\mathcal{C}_8$ with generator matrices

$$
G_1 = \begin{array}{cc} V_1 & S_2 \\ \hline 1 & 01 \\ 0 & 10 \end{array}, \qquad
G_8 = \begin{array}{cc} S_8 & V_8 \\ \hline 10 & 1 \\ 01 & 1 \end{array} \tag{52}
$$
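As a quick check of (52), the projections can be computed directly from the generators of (44) and (50). The snippet below is an illustrative sketch; `row_space` and `project` are hypothetical helper names, and the bit strings assume the matrices read as given above.

```python
from itertools import product

def row_space(rows):
    """All GF(2) linear combinations of the given bit strings, as bit tuples."""
    vecs = [[int(b) for b in r] for r in rows]
    return {tuple(sum(c * v[i] for c, v in zip(cs, vecs)) % 2
                  for i in range(len(vecs[0])))
            for cs in product((0, 1), repeat=len(vecs))}

def project(words, keep):
    """Projection of a code onto the coordinate positions listed in keep."""
    return {tuple(w[i] for i in keep) for w in words}

C1 = row_space(["1010", "0101"])   # columns S1, V1, S2 (two bits), per (44)
C8 = row_space(["1010", "0111"])   # columns S8 (two bits), V8, S9, per (50)

C1_projected = project(C1, [1, 2, 3])   # discard S1
C8_projected = project(C8, [0, 1, 2])   # discard S9

print(C1_projected == row_space(["101", "010"]))  # rows of G1 in (52)
print(C8_projected == row_space(["101", "011"]))  # rows of G8 in (52)
```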



3) Internal Local Constraint Redefinition Property: Let $\mathcal{C}_i$ satisfy $n(\mathcal{C}_i) - k(\mathcal{C}_i) = m' \le m$ and consider a hidden variable $S_j$ involved in $\mathcal{C}_i$ (i.e. $j \in I_S(i)$) with alphabet index set $T_j$. Each of the $|T_j|$ coordinates of $S_j$ can be redefined as a $q$-ary sum of some subset of the visible variable set as follows. Consider the behavior $\mathfrak{B}^{\setminus i}$ and corresponding code $\mathcal{C}^{\setminus i}$ which result when $\mathcal{C}_i$ is removed from $\mathcal{G}_{\mathcal{C}}$ (before $S_j$ is discarded). The projection of $\mathfrak{B}^{\setminus i}$ onto $T_j \cup I$, $\mathfrak{B}^{\setminus i}_{|T_j \cup I}$, has length

$$
n(\mathcal{C}) + |T_j| \tag{53}
$$

and dimension

$$
k(\mathcal{C}^{\setminus i}) \ge k(\mathcal{C}) \tag{54}
$$

over $\mathbb{F}_q$. There exists a generator matrix for $\mathfrak{B}^{\setminus i}_{|T_j \cup I}$ that is systematic in some size $k(\mathcal{C}^{\setminus i})$ subset of the index set $I$ [31]. A parity-check matrix $H_j$ that is systematic in the $|T_j|$ positions corresponding to the coordinates of $S_j$ can thus be found for this projection; each coordinate of $S_j$ is defined as a $q$-ary sum of some subset of the visible variables by $H_j$. Following this procedure, the internal local constraint $\mathcal{C}_i$ is redefined over $I$ by substituting the definitions of $S_j$ implied by $H_j$ for each $j \in I_S(i)$ into each of the $m'$ $q$-ary single parity-check equations which determine $\mathcal{C}_i$.

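Returning to the example of the tail-biting trellis for $\mathcal{C}_H$, the internal local constraint $\mathcal{C}_9$ is redefined over the visible variable set as follows. The projection of $\mathfrak{B}_H^{\setminus 9}$ onto $T_1 \cup I$ is generated by

$$
G_{\mathfrak{B}^{\setminus 9}_{H|T_1 \cup I}} = \begin{array}{ccccccccc}
S_1 & V_1 & V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 \\ \hline
0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 & 0 & 0 & 1 & 1 & 0
\end{array} \tag{55}
$$

A valid parity-check matrix for this projection which is systematic in the position corresponding to $S_1$ is

$$
H_1 = \begin{array}{ccccccccc}
S_1 & V_1 & V_2 & V_3 & V_4 & V_5 & V_6 & V_7 & V_8 \\ \hline
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1
\end{array} \tag{56}
$$

which defines the binary hidden variable $S_1$ as

$$
S_1 = V_1 + V_2 \tag{57}
$$

where addition is over $\mathbb{F}_2$. A similar development defines the binary hidden variable $S_9$ as

$$
S_9 = V_5 + V_8. \tag{58}
$$

The local constraint $\mathcal{C}_9$ thus can be redefined to enforce the single parity-check equation

$$
V_1 + V_2 + V_5 + V_8 = 0. \tag{59}
$$

Finally, in order to illustrate the use of the $q^m$-ary graphical model properties in concert, denote by $\mathcal{C}_9^{(1)}$ the single parity-check constraint enforcing (59). It is readily verified that only the first four rows of $G_{H \setminus 9}$ (as defined in (51)) satisfy $\mathcal{C}_9^{(1)}$. It is precisely these four rows which generate $\mathcal{C}_H$ proving that

$$
\mathcal{C}_H = \mathcal{C}_H^{\setminus 9} \cap \mathcal{C}_9^{(1)}. \tag{60}
$$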
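The identity (60) lends itself to a direct check. The following sketch is illustrative and not from the original: it takes the five generator rows of (51) as transcribed above, confirms that the $S_1$ column of (55) equals $V_1 + V_2$ on every row, and intersects the resulting 32-word code with the single parity-check (59). All function and variable names are hypothetical.

```python
from itertools import product

def row_space(rows):
    """All GF(2) linear combinations of the given bit strings, as bit tuples."""
    vecs = [[int(b) for b in r] for r in rows]
    return {tuple(sum(c * v[i] for c, v in zip(cs, vecs)) % 2
                  for i in range(len(vecs[0])))
            for cs in product((0, 1), repeat=len(vecs))}

# Rows of (51) over V1..V8 and the S1 column of (55), as transcribed.
G_REMOVED = ["11110000", "00110110", "00001111", "01100011", "10100110"]
S1_COLUMN = [0, 0, 0, 1, 1]

def s1(w):
    """Hidden variable S1 expressed through visible bits, per (57)."""
    return (w[0] + w[1]) % 2

def s9(w):
    """Hidden variable S9 expressed through visible bits, per (58)."""
    return (w[4] + w[7]) % 2

# (57) holds on each generator row of the projection (55).
rows_consistent = all(s1([int(b) for b in row]) == s
                      for row, s in zip(G_REMOVED, S1_COLUMN))

code_removed = row_space(G_REMOVED)                       # 5-dimensional code
redefined = {w for w in code_removed if s1(w) == s9(w)}   # enforce (59)
code_h = row_space(G_REMOVED[:4])                         # first four rows

print(rows_consistent, redefined == code_h)
```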



C. Illustration of Proof of Lemma 1

In the following, the proof of Lemma 1 is illustrated by updating a cycle-free model for $\mathcal{C}_H^{\setminus 9}$ (as generated by (51)) with the single parity-check constraint defined by (59) in order to obtain a cycle-free graphical model for $\mathcal{C}_H$. A cycle-free binary graphical model for $\mathcal{C}_H^{\setminus 9}$ is illustrated in Figure 11.* All hidden variables in Figure 11 are binary and the local constraints labeled $\mathcal{C}_{14}$, $\mathcal{C}_{17}$, $\mathcal{C}_{20}$, and $\mathcal{C}_{23}$ are binary single parity-check constraints while the remaining local constraints are repetition codes. By construction, it has thus been shown that

$$
t(\mathcal{C}_H^{\setminus 9}) = 1. \tag{61}
$$

* In order to emphasize that the code and hidden variable labels in Figure 11 are in no way related to those labels used previously, the labeling of hidden variables and local constraints begins at $S_{12}$ and $\mathcal{C}_{12}$, respectively.

Fig. 11. Cycle-free binary graphical model for $\mathcal{C}_H^{\setminus 9}$. The minimal spanning tree containing the interface constraints which involve $V_1$, $V_2$, $V_5$, and $V_8$, respectively, is highlighted.

In light of (59) and (60), a 4-ary graphical model for $\mathcal{C}_H$ can be constructed by updating the graphical model illustrated in Figure 11 to enforce a single parity-check constraint on $V_1$, $V_2$, $V_5$, and $V_8$. A natural choice for the root of the minimal spanning tree containing the interface constraints incident on these variables is $\mathcal{C}_{24}$. The updating of the local constraints and hidden variables contained in this spanning tree proceeds as follows. First note that since $\mathcal{C}_{12}$, $\mathcal{C}_{15}$, $\mathcal{C}_{18}$, and $\mathcal{C}_{22}$ simply enforce equality, neither these constraints, nor the hidden variables incident on these constraints, need updating. The hidden variables $S_{14}$, $S_{17}$, $S_{20}$, and $S_{23}$ are updated to be 4-ary so that they send downstream to $\mathcal{C}_{24}$ the values of $V_1$, $V_2$, $V_8$, and $V_5$, respectively. These hidden variable updates are accomplished by redefining the local constraints $\mathcal{C}_{14}$, $\mathcal{C}_{17}$, $\mathcal{C}_{20}$, and $\mathcal{C}_{23}$; the respective generator matrices for the redefined codes are

$$
G_{14} = \begin{array}{ccc} S_{12} & S_{13} & S_{14} \\ \hline 1 & 0 & 11 \\ 0 & 1 & 10 \end{array}, \qquad
G_{17} = \begin{array}{ccc} S_{15} & S_{16} & S_{17} \\ \hline 1 & 0 & 11 \\ 0 & 1 & 10 \end{array} \tag{62}
$$

$$
G_{20} = \begin{array}{ccc} S_{18} & S_{19} & S_{20} \\ \hline 1 & 0 & 11 \\ 0 & 1 & 10 \end{array}, \qquad
G_{23} = \begin{array}{ccc} S_{21} & S_{22} & S_{23} \\ \hline 1 & 0 & 10 \\ 0 & 1 & 11 \end{array} \tag{63}
$$



Finally, $\mathcal{C}_{24}$ is updated to enforce both the original repetition constraint on the respective first coordinates of $S_{14}$, $S_{17}$, $S_{20}$, and $S_{23}$ and the additional single parity-check constraint on $V_1$, $V_2$, $V_5$, and $V_8$ (which correspond to the respective second coordinates of $S_{14}$, $S_{17}$, $S_{20}$, and $S_{23}$). The generator matrix for the redefined $\mathcal{C}_{24}$ is

$$
G_{24} = \begin{array}{cccc}
S_{14} & S_{17} & S_{20} & S_{23} \\ \hline
10 & 10 & 10 & 10 \\
01 & 00 & 00 & 01 \\
00 & 01 & 00 & 01 \\
00 & 00 & 01 & 01
\end{array} \tag{64}
$$



The updated constraints all satisfy the condition for inclusion in a 4-ary graphical model. Specifically, $\mathcal{C}_{24}$ can be decomposed into the Cartesian product of a length 4 binary repetition code and a length 4 binary single parity-check code.

The updated graphical model is 4-ary and it has thus been shown by construction that

$$
t(\mathcal{C}_H) \le t(\mathcal{C}_H^{\setminus 9}) + 1 = 2. \tag{65}
$$
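The decomposition of the redefined $\mathcal{C}_{24}$ can be confirmed by separating the first and second coordinates of its four 4-ary variables. The sketch below is an illustration under the assumption that (62) and (64) read as given above; the names `first`, `second`, and `forwards_visible_bit` are hypothetical.

```python
from itertools import product

def row_space(rows):
    """All GF(2) linear combinations of the given bit strings, as bit tuples."""
    vecs = [[int(b) for b in r] for r in rows]
    return {tuple(sum(c * v[i] for c, v in zip(cs, vecs)) % 2
                  for i in range(len(vecs[0])))
            for cs in product((0, 1), repeat=len(vecs))}

# Redefined C24 from (64); columns S14, S17, S20, S23 with two bits each.
C24 = row_space(["10101010", "01000001", "00010001", "00000101"])

first = {w[0::2] for w in C24}    # first coordinate of each hidden variable
second = {w[1::2] for w in C24}   # second coordinate of each hidden variable

repetition = {(0, 0, 0, 0), (1, 1, 1, 1)}
single_parity = {w for w in product((0, 1), repeat=4) if sum(w) % 2 == 0}

# A Cartesian product has exactly |first| * |second| codewords.
is_product = (first == repetition and second == single_parity
              and len(C24) == len(repetition) * len(single_parity))

# Redefined C14 from (62); columns S12, S13, S14 (two bits).
C14 = row_space(["1011", "0110"])
# The first bit of S14 is the original parity; the second forwards S12.
forwards_visible_bit = all(w[2] == (w[0] + w[1]) % 2 and w[3] == w[0]
                           for w in C14)

print(is_product, forwards_visible_bit)
```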



D. Graphical Model Transformations

The eight basic graphical model operations introduced in Section V are discussed in detail in the following where it is assumed that a $q^m$-ary graphical model $\mathcal{G}_{\mathcal{C}}$ with behavior $\mathfrak{B}$ for a linear code $\mathcal{C}$ over $\mathbb{F}_q$ defined on an index set $I$ is given.

1) Local Constraint Merging: Suppose that two local constraints $\mathcal{C}_{i_1}$ and $\mathcal{C}_{i_2}$ are to be merged. Without loss of generality, assume that there is no hidden variable incident on both $\mathcal{C}_{i_1}$ and $\mathcal{C}_{i_2}$ (since if there is, a degree-2 repetition constraint can be inserted). The hidden variables incident on $\mathcal{C}_{i_1}$ may be partitioned into two sets

$$
I_S(i_1) = I_S^{(c)}(i_1) \cup I_S^{(nc)}(i_1) \tag{66}
$$

where each $S_j$, $j \in I_S^{(c)}(i_1)$, is also incident on a constraint $\mathcal{C}_j$ which is adjacent to $\mathcal{C}_{i_2}$. The hidden variables incident on $\mathcal{C}_{i_2}$ may be similarly partitioned. The local constraints incident on hidden variables in both $I_S^{(c)}(i_1)$ and $I_S^{(c)}(i_2)$ are denoted common constraints and indexed by $I_C^{(c)}(i_1, i_2)$. Figure 12 illustrates this notation.






Fig. 12. Local constraint merging notation. The local constraints Cci and 



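The merging of local constraints $\mathcal{C}_{i_1}$ and $\mathcal{C}_{i_2}$ proceeds as follows. For each common local constraint $\mathcal{C}_j$, $j \in I_C^{(c)}(i_1, i_2)$, denote by $S_{j_1}$ ($S_{j_2}$) the hidden variable incident on $\mathcal{C}_j$ and $\mathcal{C}_{i_1}$ ($\mathcal{C}_{i_2}$). Denote by $\mathcal{C}_{j|\{j_1, j_2\}}$ the projection of $\mathcal{C}_j$ onto the two variable index set $\{j_1, j_2\}$ and define a new $q^{k(\mathcal{C}_{j|\{j_1, j_2\}})}$-ary hidden variable $S_{j_1, j_2}$ which encapsulates the possible simultaneous values of $S_{j_1}$ and $S_{j_2}$ (as constrained by $\mathcal{C}_{j|\{j_1, j_2\}}$). After defining such hidden variables for each $\mathcal{C}_j$, $j \in I_C^{(c)}(i_1, i_2)$, a set of new hidden variables results which is indexed by $I_S^{(c)}(i_1, i_2)$. The local constraints $\mathcal{C}_{i_1}$ and $\mathcal{C}_{i_2}$ are then merged by replacing $\mathcal{C}_{i_1}$ and $\mathcal{C}_{i_2}$ by a code defined over

$$
I_V(i_1) \cup I_V(i_2) \cup I_S^{(nc)}(i_1) \cup I_S^{(nc)}(i_2) \cup I_S^{(c)}(i_1, i_2) \tag{67}
$$

which is equivalent to $\mathcal{C}_{i_1} \cap \mathcal{C}_{i_2}$ and redefining each local constraint $\mathcal{C}_j$, $j \in I_C^{(c)}(i_1, i_2)$, over the appropriate hidden variables in $I_S^{(c)}(i_1, i_2)$.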



As an example, consider again the 4-ary cycle-free graphical model for $\mathcal{C}_H$ derived in the previous section, a portion of which is re-illustrated on the bottom left of Figure 13, and suppose that the local constraints $\mathcal{C}_{14}$ and $\mathcal{C}_{17}$ are to be merged. The local constraints $\mathcal{C}_{14}$, $\mathcal{C}_{17}$, and $\mathcal{C}_{24}$ are defined by (62) and (64).

Fig. 13. The merging of constraints $\mathcal{C}_{14}$ and $\mathcal{C}_{17}$ in a 4-ary graphical model for $\mathcal{C}_H$. The resulting graphical model is 8-ary.



The hidden variables incident on $\mathcal{C}_{14}$ are partitioned into the sets $I_S^{(c)}(14) = \{14\}$ and $I_S^{(nc)}(14) = \{12, 13\}$. Similarly, $I_S^{(c)}(17) = \{17\}$ and $I_S^{(nc)}(17) = \{15, 16\}$. The sole common constraint is thus $\mathcal{C}_{24}$. The projection of $\mathcal{C}_{24}$ onto $S_{14}$ and $S_{17}$ has dimension 3 and the new 8-ary hidden variable $S_{25}$ is defined by the generator matrix

$$
G_{14,17} = \begin{array}{ccc}
S_{25} & S_{14} & S_{17} \\ \hline
100 & 10 & 10 \\
010 & 01 & 00 \\
001 & 00 & 01
\end{array} \tag{68}
$$



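The local constraints $\mathcal{C}_{14}$ and $\mathcal{C}_{17}$ when defined over $S_{25}$ rather than $S_{14}$ and $S_{17}$, respectively, are generated by

$$
G'_{14} = \begin{array}{ccc}
S_{12} & S_{13} & S_{25} \\ \hline
1 & 0 & 110 \\
1 & 0 & 111 \\
0 & 1 & 100
\end{array}, \qquad
G'_{17} = \begin{array}{ccc}
S_{15} & S_{16} & S_{25} \\ \hline
1 & 0 & 101 \\
1 & 0 & 111 \\
0 & 1 & 100
\end{array} \tag{69}
$$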



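Finally, $\mathcal{C}_{24}$ is redefined over $S_{25}$ and generated by

$$
G_{24} = \begin{array}{ccc}
S_{25} & S_{20} & S_{23} \\ \hline
100 & 10 & 10 \\
010 & 00 & 01 \\
001 & 00 & 01 \\
000 & 01 & 01
\end{array} \tag{70}
$$

while $\mathcal{C}_{14}$ and $\mathcal{C}_{17}$ are replaced by $\mathcal{C}_{25}$ which is equivalent to $\mathcal{C}_{14} \cap \mathcal{C}_{17}$ and is generated by

$$
G_{25} = \begin{array}{ccccc}
S_{25} & S_{12} & S_{13} & S_{15} & S_{16} \\ \hline
100 & 0 & 1 & 0 & 1 \\
010 & 1 & 1 & 0 & 0 \\
001 & 0 & 0 & 1 & 1
\end{array} \tag{71}
$$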



Note that the graphical model which results from the merging of $\mathcal{C}_{14}$ and $\mathcal{C}_{17}$ is 8-ary. Specifically, $S_{25}$ is an 8-ary hidden variable while $n(\mathcal{C}_{24}) - k(\mathcal{C}_{24}) = 3$ and $k(\mathcal{C}_{25}) = 3$.
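The merging example can be reproduced with a few lines of set arithmetic. This is a sketch under stated assumptions rather than the authors' procedure: the codes are taken from (62) and (64) as transcribed above, the 8-ary variable is encoded as in (68) (its three bits being the shared first coordinate followed by the second coordinates of $S_{14}$ and $S_{17}$), and all names are hypothetical.

```python
from itertools import product

def row_space(rows):
    """All GF(2) linear combinations of the given bit strings, as bit tuples."""
    vecs = [[int(b) for b in r] for r in rows]
    return {tuple(sum(c * v[i] for c, v in zip(cs, vecs)) % 2
                  for i in range(len(vecs[0])))
            for cs in product((0, 1), repeat=len(vecs))}

C14 = row_space(["1011", "0110"])   # columns S12, S13, S14 (two bits), per (62)
C17 = row_space(["1011", "0110"])   # columns S15, S16, S17 (two bits), per (62)
C24 = row_space(["10101010", "01000001", "00010001", "00000101"])  # (64)

# Projection of the common constraint C24 onto (S14, S17).
joint = {w[:4] for w in C24}
dimension = len(joint).bit_length() - 1   # log2 of the number of codewords

# Merge: keep pairs of local codewords whose (S14, S17) values are allowed
# by the projection, and relabel them with the new variable S25.
merged = set()
for u in C14:
    for v in C17:
        if u[2:] + v[2:] in joint:
            s25 = (u[2], u[3], v[3])        # encoding of (68)
            merged.add(s25 + u[:2] + v[:2])  # columns S25, S12, S13, S15, S16

G25 = ["1000101", "0101100", "0010011"]     # rows of (71)
print(dimension, merged == row_space(G25))
```

The new variable needs exactly as many bits as the dimension of the projection, which is why the merged model in this example is 8-ary rather than 16-ary.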



2) Local Constraint Splitting: Local constraint splitting is simply the inverse operation of local constraint merging. Consider a local constraint $\mathcal{C}_j$ defined on the visible and hidden variables indexed by $I_V(j)$ and $I_S(j)$, respectively. Suppose that $\mathcal{C}_j$ is to be split into two local constraints $\mathcal{C}_{j_1}$ and $\mathcal{C}_{j_2}$ defined on the index sets $I_V(j_1) \cup I_S(j_1)$ and $I_V(j_2) \cup I_S(j_2)$, respectively, such that $I_V(j_1)$ and $I_V(j_2)$ partition $I_V(j)$ while $I_S(j_1) \cup I_S(j_2) = I_S(j)$ but $I_S(j_1)$ and $I_S(j_2)$ need not be disjoint. Denote by $I_S(j_1, j_2)$ the intersection of $I_S(j_1)$ and $I_S(j_2)$. Local constraint splitting proceeds as follows. For each $S_i$, $i \in I_S(j_1, j_2)$, make a copy $S_{i'}$ of $S_i$ and redefine the local constraint incident on $S_i$ (which is not $\mathcal{C}_j$) over both $S_i$ and $S_{i'}$. Denote by $I_{S'}(j_1, j_2)$ an index set for the copied hidden variables. The local constraint $\mathcal{C}_j$ is then replaced by $\mathcal{C}_{j_1}$ and $\mathcal{C}_{j_2}$ such that $\mathcal{C}_{j_1}$ is defined over $I_V(j_1) \cup I_S(j_1)$ and $\mathcal{C}_{j_2}$ is defined over

$$
I_V(j_2) \cup \left( I_S(j_2) \setminus I_S(j_1, j_2) \right) \cup I_{S'}(j_1, j_2). \tag{72}
$$

Following this split procedure, some of the hidden variables in $I_S(j_1, j_2)$ and $I_{S'}(j_1, j_2)$ may have larger alphabets than necessary. Specifically, if the dimension of the projection of $\mathcal{C}_{j_1}$ ($\mathcal{C}_{j_2}$) onto a variable $S_i$, $i \in I_S(j_1, j_2)$ ($i \in I_{S'}(j_1, j_2)$), is smaller than the alphabet index set size of $S_i$, then $S_i$ can be redefined with an alphabet index set size equal to that dimension.

The merged code $\mathcal{C}_{25}$ in the example of the previous section can be split into two codes: $\mathcal{C}_{14}$ defined on $S_{12}$, $S_{13}$, and $S_{25}$, and $\mathcal{C}_{17}$ defined on $S_{15}$, $S_{16}$, and $S_{25'}$. The projection of $S_{25}$ onto $\mathcal{C}_{14}$ has dimension 2 and $S_{25}$ can thus be replaced by the 4-ary hidden variable $S_{14}$. Similarly, the projection of $S_{25'}$ onto $\mathcal{C}_{17}$ has dimension 2 and $S_{25'}$ can be replaced by the 4-ary hidden variable $S_{17}$.

3) Insertion/Removal of Degree-2 Repetition Constraints: Suppose that $S_i$ is a hidden variable involved in the local constraints $\mathcal{C}_{i_1}$ and $\mathcal{C}_{i_2}$. A degree-2 repetition constraint is inserted by defining a new hidden variable $S_j$ as a copy of $S_i$, redefining $\mathcal{C}_{i_2}$ over $S_j$ and defining the repetition constraint $\mathcal{C}_j$ which enforces $S_i = S_j$. Degree-2 repetition constraint insertion can be similarly defined for visible variables. Conversely, suppose that $\mathcal{C}_j$ is a degree-2 repetition constraint incident on the hidden variables $S_i$ and $S_j$. Since $\mathcal{C}_j$ simply enforces $S_i = S_j$, it can be removed and $S_j$ relabeled $S_i$. Degree-2 repetition constraint removal can be similarly defined for visible variables. The insertion and removal of degree-2 repetition constraints is illustrated in Figures 14(a) and 14(b) for hidden and visible variables, respectively.



Fig. 14. Insertion and removal of degree-2 repetition constraints.

4) Insertion/Removal of Trivial Constraints: Trivial constraints are those incident on no hidden or visible variables so that their respective block lengths and dimensions are zero. Trivial constraints can obviously be inserted or removed from graphical models.

5) Insertion/Removal of Isolated Partial Parity-Check Constraints: Suppose that $\mathcal{C}_{i_1}, \ldots, \mathcal{C}_{i_j}$ are $j$ $q$-ary repetition constraints (that is, each repetition constraint enforces equality on $q$-ary variables) and let $\beta_{i_1}, \ldots, \beta_{i_j} \in \mathbb{F}_q$ be non-zero. The insertion of an isolated partial parity-check constraint is defined as follows. Define $j + 1$ new $q$-ary hidden variables $S_{i_1}, \ldots, S_{i_j}$ and $S_k$, and two new local constraints $\mathcal{C}_p$ and $\mathcal{C}_k$ such that $\mathcal{C}_p$ enforces the $q$-ary single parity-check equation

$$
\sum_{l=1}^{j} \beta_{i_l} S_{i_l} = S_k \tag{73}
$$

and $\mathcal{C}_k$ is a degree-1 constraint incident only on $S_k$ with dimension 1. The new local constraint $\mathcal{C}_p$ defines the partial parity variable $S_k$ and is denoted isolated since it is incident on a hidden variable which is involved in a degree-1, dimension 1 local constraint (i.e. $\mathcal{C}_k$ does not constrain the value of $S_k$). Since $\mathcal{C}_p$ is isolated, the graphical model that results from its insertion is indeed a valid model for $\mathcal{C}$. Similarly, any such isolated partial parity-check constraint can be removed from a graphical model resulting in a valid model for $\mathcal{C}$.
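Why isolation matters can be seen on the extended Hamming code. The sketch below is illustrative only: it uses the rows of (49), as transcribed above, as parity checks (the code is self-dual), attaches a partial parity variable equal to the binary sum of $V_7$ and $V_8$, and compares the projected code when that variable is left unconstrained with the case where it is forced to zero. The names `extended`, `isolated`, and `pinned` are hypothetical.

```python
from itertools import product

# Rows of (49), used here as parity checks since the code is self-dual.
H = ["11110000", "00110110", "00001111", "01100011"]

def satisfies(word):
    """True if the word is orthogonal over GF(2) to every row of H."""
    return all(sum(int(h) * x for h, x in zip(row, word)) % 2 == 0
               for row in H)

code = {w for w in product((0, 1), repeat=8) if satisfies(w)}

# Insert the partial parity variable S_k = V7 + V8 (binary case of (73)).
extended = {w + ((w[6] + w[7]) % 2,) for w in code}

# Isolated: S_k is unconstrained, so projecting it away returns the code.
isolated = {c[:8] for c in extended}

# Not isolated: forcing S_k = 0 would discard codewords.
pinned = {c[:8] for c in extended if c[8] == 0}

print(len(code), isolated == code, len(pinned))
```

Leaving $S_k$ free returns all codewords, whereas constraining it yields a proper subcode, which is why the degree-1 constraint on $S_k$ must have dimension 1.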

As an example, Figure 15 illustrates the insertion and removal of an isolated partial parity-check on the binary sum of $V_7$ and $V_8$ in a Tanner graph for $\mathcal{C}_H$ corresponding to (49) (note that $\mathcal{C}_H$ is self-dual so that the generator matrix defined in (49) is also a valid parity-check matrix for $\mathcal{C}_H$).

Fig. 15. The insertion/removal of an isolated partial parity-check constraint on $V_7$ and $V_8$ in a Tanner graph for $\mathcal{C}_H$.